A NOVEL HUMAN VIRUS CAUSING
SEVERE ACUTE RESPIRATORY SYNDROME (SARS) AND USES 
THEREOF 

This application claims priority benefit to U.S. provisional application no. 
60/457,031, filed March 24, 2003; U.S. provisional application no. 60/457,730, filed March
26, 2003; U.S. provisional application no. 60/459,931, filed April 2, 2003; U.S. provisional 
application no. 60/460,357, filed April 3, 2003; U.S. provisional application no. 60/461,265, 
filed April 8, 2003; U.S. provisional application no. 60/462,805, filed April 14, 2003; and 
U.S. provisional application no. 60/464,886 filed April 23, 2003, each of which is 
incorporated herein by reference in its entirety. 

The instant application contains a lengthy Sequence Listing which is being 
concurrently submitted via triplicate CD-R in lieu of a printed paper copy, and is hereby 
incorporated by reference in its entirety. Said CD-R, recorded on March 16, 2004, are 
labeled "CRT", "Copy 1" and "Copy 2", respectively, and each contains only one identical 
1.58 MB file (V9661069.APP). 

1. INTRODUCTION 

The present invention relates to an isolated novel virus causing Severe Acute 
Respiratory Syndrome (SARS) in humans ("hSARS virus"). The hSARS virus is identified 
to be morphologically and phylogenetically similar to known members of Coronaviridae. 
The present invention relates to a nucleotide sequence comprising the complete genomic 
sequence of the hSARS virus. The invention further relates to nucleotide sequences
comprising a portion of the genomic sequence of the hSARS virus. The invention also 
relates to the deduced amino acid sequences of the complete genome of the hSARS virus. 
The invention further relates to the nucleic acids and peptides encoded by and/or derived 
from these sequences and their use in diagnostic methods and therapeutic methods, such as 
for immunogens. The invention further encompasses chimeric or recombinant viruses 
encoded by said nucleotide sequences and antibodies directed against polypeptides encoded 
by the nucleotide sequence. Furthermore, the invention relates to vaccine preparations
comprising the hSARS virus, including recombinant and chimeric forms of said virus as
well as protein extracts and subunits of said virus. 

2. BACKGROUND OF THE INVENTION 

Recently, there has been an outbreak of atypical pneumonia in Guangdong province
in mainland China. Between November 2002 and March 2003, there were 792 reported
cases with 31 fatalities (WHO. Severe Acute Respiratory Syndrome (SARS). Weekly
Epidemiol Rec. 2003; 78: 86). In response to this crisis, the Hospital Authority in Hong
Kong has increased its surveillance of patients with severe atypical pneumonia. In the
course of this investigation, a number of clusters of health care workers with the disease
were identified. In addition, there were clusters of pneumonia incidents among persons in
close contact with those infected. The disease was unusual in its severity and in its
progression in spite of the antibiotic treatment typical for the bacterial pathogens that are
known to be commonly associated with atypical pneumonia. The present inventors were
one of the groups involved in the investigation of these patients. All tests for identifying
commonly recognized viruses and bacteria were negative in these patients. The disease was
given the acronym Severe Acute Respiratory Syndrome ("SARS"). The etiologic agent
responsible for this disease was not known until the isolation of hSARS virus from the
SARS patients by the present inventors as disclosed herein. Namely, the present invention
discloses a novel human virus that has been isolated and identified from the patients
suffering from SARS. The invention is useful in both clinical and scientific research
applications.

3. SUMMARY OF INVENTION 

The present invention is based upon the inventors' isolation and identification of a
novel virus causing Severe Acute Respiratory Syndrome in humans ("hSARS virus"). The
virus was isolated from the patients suffering from SARS in the recent outbreak of severe 
atypical pneumonia in China. The isolated virus is an enveloped, single-stranded RNA 
virus of positive polarity which belongs to the family Coronaviridae of the order
Nidovirales. Accordingly, the invention relates to the isolated hSARS virus that
morphologically and phylogenetically relates to known members of Coronaviridae. In a
specific embodiment, the isolated hSARS virus is that which was deposited with China 
Center for Type Culture Collection (CCTCC) on April 2, 2003 and accorded an accession 
number, CCTCC-V200303, as described in Section 7, infra. In another specific
embodiment, the invention provides the complete genomic sequence of the hSARS virus. In a
preferred embodiment, the virus comprises a nucleotide sequence of SEQ ID NO: 15. In 
another specific embodiment, the invention provides nucleic acids isolated from the virus. 
The virus preferably comprises a nucleotide sequence of SEQ ID NO: 1, 11 and/or 13 in its
genome. In a specific embodiment, the present invention provides isolated nucleic acid
molecules comprising or, alternatively, consisting of the nucleotide sequence of SEQ ID
NO: 1, a complement thereof or a portion thereof, preferably at least 5, 10, 15, 20, 25, 30, 35,
40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, or more contiguous nucleotides of
the nucleotide sequence of SEQ ID NO: 1, or a complement thereof. In another specific
embodiment, the present invention provides isolated nucleic acid molecules comprising or,
alternatively, consisting of the nucleotide sequence of SEQ ID NO: 11, a complement
thereof or a portion thereof, preferably at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150,
200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050,
1,100, 1,150, 1,200, or more contiguous nucleotides of the nucleotide sequence of SEQ ID
NO: 11, or a complement thereof. In yet another specific embodiment, the present invention
provides isolated nucleic acid molecules comprising or, alternatively, consisting of the
nucleotide sequence of SEQ ID NO: 13, a complement thereof or a portion thereof, 
preferably at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 
550, 600, 650, 700, or more contiguous nucleotides of the nucleotide sequence of SEQ ID 
NO: 13, or a complement thereof. In another specific embodiment, the present invention
provides isolated nucleic acid molecules comprising or, alternatively, consisting of the
nucleotide sequence of SEQ ID NO: 15, a complement thereof or a portion thereof, 
preferably at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 
550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 
4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000,
16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000,
27,000, 28,000, 29,000 or more contiguous nucleotides of the nucleotide sequence of SEQ
ID NO: 15, or a complement thereof. Furthermore, in another specific embodiment, the
invention provides isolated nucleic acid molecules which hybridize under stringent
conditions, as defined herein, to a nucleic acid molecule having the sequence of SEQ ID
NO: 1, 11, 13, 15, 16, 240, 737, 1108, 1590 or 1965 or a complement thereof. In one
embodiment, the invention provides an isolated nucleic acid molecule which is antisense to 
the coding strand of a nucleic acid of the invention. In another specific embodiment, the
invention provides isolated polypeptides or proteins that are encoded by a nucleic acid 
molecule comprising or, alternatively consisting of a nucleotide sequence that is at least 5, 
10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, or more 
contiguous nucleotides of the nucleotide sequence of SEQ ID NO: 1, or a complement
thereof. In yet another specific embodiment, the invention provides isolated polypeptides or
proteins that are encoded by a nucleic acid molecule comprising or, alternatively consisting 
of a nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 
350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 
1,200 or more contiguous nucleotides of the nucleotide sequence of SEQ ID NO: 11, or a
complement thereof. In yet another specific embodiment, the invention provides isolated
polypeptides or proteins that are encoded by a nucleic acid molecule comprising or, 
alternatively consisting of a nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40, 
45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, or more contiguous 
nucleotides of the nucleotide sequence of SEQ ID NO: 13, or a complement thereof. In yet
another specific embodiment, the invention provides isolated polypeptides or proteins that
are encoded by a nucleic acid molecule comprising or, alternatively consisting of a 
nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 
400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 
1,200, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000,
13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 23,000,
24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or more contiguous nucleotides of the 
nucleotide sequence of SEQ ID NO: 15, or a complement thereof. The invention further 
provides proteins or polypeptides that are isolated from the hSARS virus, including viral 
proteins isolated from cells infected with the virus but not present in comparable uninfected
cells. The invention further provides proteins or polypeptides of SEQ ID NOS: 2, 12 and 14
and those shown in Figures 11 (SEQ ID NOS: 17-239, 241-736 and 738-1107) and 12
(SEQ ID NOS: 1109-1589, 1591-1964 and 1966-2470). The polypeptides or the proteins of the
present invention preferably have a biological activity of the protein (including antigenicity and/or
immunogenicity) encoded by the sequence of SEQ ID NO: 1, 11, 13, 16, 240, 737, 1108,
1590 or 1965. In other embodiments, the polypeptides or the proteins of the present
invention have a biological activity of the protein (including antigenicity and/or
immunogenicity) encoded by a nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35,
40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 
1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 
10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 
21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or more contiguous 
nucleotides of the nucleotide sequence of SEQ ID NO: 15, or a complement thereof. In
other embodiments, the polypeptides or the proteins of the present invention have a
biological activity of the protein (including antigenicity and/or immunogenicity) of Figures
11 (SEQ ID NOS: 17-239, 241-736 and 738-1107) and 12 (SEQ ID NOS: 1109-1589,
1591-1964 and 1966-2470).
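
By way of illustration only, the "portion" and "complement" language used in the foregoing embodiments may be sketched as follows. The sketch enumerates every run of a given number of contiguous nucleotides in a sequence and computes the complementary strand. The sequence shown is a made-up placeholder and is not SEQ ID NO: 1, 11, 13 or 15; the function names are hypothetical and form no part of the disclosure. The complement is assumed here to be read 5' to 3', i.e., as the reverse complement.

```python
# Illustrative sketch only. The sequence below is a hypothetical placeholder,
# not any sequence of the Sequence Listing.

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def contiguous_portions(seq: str, length: int):
    """Yield every run of `length` contiguous nucleotides found in `seq`."""
    if length <= 0:
        raise ValueError("length must be positive")
    for start in range(len(seq) - length + 1):
        yield seq[start:start + length]


example = "ATGGCGTACCTGAACTTGCA"  # hypothetical 20-nucleotide sequence
portions = list(contiguous_portions(example, 15))

print(reverse_complement(example))
print(len(portions), "portions of 15 contiguous nucleotides")
```

A sequence of N nucleotides contains N - L + 1 portions of L contiguous nucleotides, so the number of candidate portions falls as the recited minimum length rises.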

In one aspect, the invention provides a method for propagating the hSARS virus in
host cells comprising infecting the host cells with the isolated hSARS virus, culturing the
host cells to allow the virus to multiply, and harvesting the resulting virions. Also provided
by the present invention are host cells that are infected with the hSARS virus. In another
aspect, the invention relates to the use of the isolated hSARS virus for diagnostic and
therapeutic methods. In a specific embodiment, the invention provides a method of
detecting in a biological sample an antibody immunospecific for the hSARS virus using the
isolated hSARS virus or any proteins or polypeptides thereof. In another specific
embodiment, the invention provides a method of screening for an antibody which
immunospecifically binds and neutralizes hSARS. Such an antibody is useful for
passive immunization or immunotherapy of a subject infected with hSARS.

The invention further relates to the use of the sequence information of the isolated 
virus for diagnostic and therapeutic methods. In a specific embodiment, the invention 
provides nucleic acid molecules which are suitable for use as primers consisting of or 
comprising the nucleotide sequence of SEQ ID NO: 1, 11, 13, or 15, a complement thereof,
or at least a portion of the nucleotide sequence thereof. In another specific embodiment, the
invention provides nucleic acid molecules which are suitable for hybridization to hSARS
nucleic acid, including, but not limited to, as PCR primers, Reverse Transcriptase primers,
probes for Southern analysis or other nucleic acid hybridization analysis for the detection of
hSARS nucleic acids, e.g., consisting of or comprising the nucleotide sequence of SEQ ID
NO: 1, 11, 13, or 15, a complement thereof, or a portion thereof. The invention further
encompasses chimeric or recombinant viruses encoded in whole or in part by said
nucleotide sequences.

The invention further provides antibodies that specifically bind a polypeptide of the 
invention encoded by the nucleotide sequence of SEQ ID NO: 1, 11, 13, 16, 240, 737, 1108,
1590 or 1965, or a fragment thereof, or encoded by a nucleic acid comprising a nucleotide
sequence that hybridizes under stringent conditions to the nucleotide sequence of SEQ ID
NO: 1, 11, or 13, and/or any hSARS epitope, having one or more biological activities of a
polypeptide of the invention. The invention further provides antibodies that specifically 
bind polypeptides of the invention encoded by the nucleotide sequence of SEQ ID NO: 15 or 
a complement thereof, or a fragment thereof. These polypeptides include those shown in 
Figures 11 (SEQ ID NOS: 17-239, 241-736 and 738-1107) and 12 (SEQ ID NOS: 1109-1589,
1591-1964 and 1966-2470). The invention further provides antibodies that specifically bind
polypeptides of the invention encoded by a nucleic acid comprising a nucleotide sequence 
that hybridizes under stringent conditions to the nucleotide sequence of SEQ ID NO: 15, 
and/or any hSARS epitope, having one or more biological activities of a polypeptide of the 
invention. Such antibodies include, but are not limited to, polyclonal, monoclonal,
bi-specific, multi-specific, human, humanized, chimeric antibodies, single chain antibodies,
Fab fragments, F(ab')2 fragments, disulfide-linked Fvs, intrabodies and fragments
containing either a VL or VH domain or even a complementarity determining region (CDR)
that specifically binds to a polypeptide of the invention. 

In one embodiment, the invention provides methods for detecting the presence,
activity or expression of the hSARS virus of the invention in a biological material, such as
cells, blood, saliva, urine, and so forth. The increased or decreased activity or expression of 
the hSARS virus in a sample relative to a control sample can be determined by contacting 
the biological material with an agent which can detect directly or indirectly the presence, 
activity or expression of the hSARS virus. In a specific embodiment, the detecting agents
are the antibodies or nucleic acid molecules of the present invention. Antibodies of the
invention may also be used to treat SARS.

In another embodiment, the invention provides vaccine preparations, comprising the
hSARS virus, including recombinant and chimeric forms of said virus, or protein subunits
of the virus. In a specific embodiment, the vaccine preparations of the present invention 
comprise live but attenuated hSARS virus with or without adjuvants. In another specific 
embodiment, the vaccine preparations of the invention comprise an inactivated or killed
hSARS virus. Such attenuated or inactivated viruses may be prepared by a series of
passages of the virus through the host cells or by preparing recombinant or chimeric forms
of virus. Accordingly, the present invention further provides methods of preparing
recombinant or chimeric forms of hSARS. In another specific embodiment, the vaccine
preparations of the present invention comprise a nucleic acid or fragment of the hSARS
virus, e.g., the virus having accession no. CCTCC-V200303, or nucleic acid molecules
having the sequence of SEQ ID NO: 1, 11, 13, or 15, or a fragment thereof. In another
embodiment, the invention provides vaccine preparations comprising one or more 
polypeptides isolated from or produced from nucleic acid of hSARS virus, for example, of
deposit accession no. CCTCC-V200303. In a specific embodiment, the vaccine
preparations comprise a polypeptide of the invention encoded by the nucleotide sequence of
SEQ ID NO: 1, 11, 13, 16, 240, 737, 1108, 1590 or 1965, or a fragment thereof. In a
specific embodiment, the vaccine preparations comprise polypeptides of the invention as
shown in Figures 11 (SEQ ID NOS: 17-239, 241-736 and 738-1107) and 12 (SEQ ID
NOS: 1109-1589, 1591-1964 and 1966-2470) or encoded by the nucleotide sequence of SEQ
ID NO: 15, or a fragment thereof. Furthermore, the present invention provides methods for
treating, ameliorating, managing or preventing SARS by administering the vaccine
preparations or antibodies of the present invention alone or in combination with adjuvants, 
or other pharmaceutically acceptable excipients. 

In another aspect, the present invention provides pharmaceutical compositions
comprising anti-viral agents of the present invention and a pharmaceutically acceptable
carrier. In a specific embodiment, the anti-viral agent of the invention is an antibody that
immunospecifically binds hSARS virus or any hSARS epitope. In another specific
embodiment, the anti-viral agent is a polypeptide or protein of the present invention or
nucleic acid molecule of the invention. The invention also provides kits containing a
pharmaceutical composition of the present invention.

3.1 Definitions

The term "an antibody or an antibody fragment that immunospecifically binds a 
polypeptide of the invention" as used herein refers to an antibody or a fragment thereof that 
immunospecifically binds to the polypeptide encoded by the nucleotide sequence of SEQ ID
NO: 1, 11, 13 or 15, or a fragment thereof, and does not non-specifically bind to other
polypeptides. An antibody or a fragment thereof that immunospecifically binds to the 
polypeptide of the invention may cross-react with other antigens. Preferably, an antibody or 
a fragment thereof that immunospecifically binds to a polypeptide of the invention does not 
cross-react with other antigens. An antibody or a fragment thereof that immunospecifically
binds to the polypeptide of the invention can be identified by, for example, immunoassays
or other techniques known to those skilled in the art. 

An "isolated" or "purified" peptide or protein is substantially free of cellular material 
or other contaminating proteins from the cell or tissue source from which the protein is 
derived, or substantially free of chemical precursors or other chemicals when chemically
synthesized. The language "substantially free of cellular material" includes preparations of
a polypeptide/protein in which the polypeptide/protein is separated from cellular
components of the cells from which it is isolated or recombinantly produced. Thus, a
polypeptide/protein that is substantially free of cellular material includes preparations of the
polypeptide/protein having less than about 30%, 20%, 10%, 5%, 2.5%, or 1% (by dry
weight) of contaminating protein. When the polypeptide/protein is recombinantly produced,
it is also preferably substantially free of culture medium, i.e., culture medium represents 
less than about 20%, 10%, or 5% of the volume of the protein preparation. When 
polypeptide/protein is produced by chemical synthesis, it is preferably substantially free of 
chemical precursors or other chemicals, i.e., it is separated from chemical precursors or
other chemicals which are involved in the synthesis of the protein. Accordingly, such
preparations of the polypeptide/protein have less than about 30%, 20%, 10%, 5% (by dry
weight) of chemical precursors or compounds other than polypeptide/protein fragment of 
interest. In a preferred embodiment of the present invention, polypeptides/proteins are 
isolated or purified. 

An "isolated" nucleic acid molecule is one which is separated from other nucleic
acid molecules which are present in the natural source of the nucleic acid molecule.
Moreover, an "isolated" nucleic acid molecule, such as a cDNA molecule, can be
substantially free of other cellular material, or culture medium when produced by
recombinant techniques, or substantially free of chemical precursors or other chemicals 
when chemically synthesized. In a preferred embodiment of the invention, nucleic acid 
molecules encoding polypeptides/proteins of the invention are isolated or purified. The 
term "isolated" nucleic acid molecule does not include a nucleic acid that is a member of a
library that has not been purified away from other library clones containing other nucleic 
acid molecules. 

The term "portion" or "fragment" as used herein refers to a fragment of a nucleic 
acid molecule containing at least about 25, 30, 35, 40, 45, 100, 150, 200, 250, 300, 350, 400,
450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 1,250,
1,300, 1,350, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000,
13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 23,000,
24,000, 25,000, 26,000, 27,000, 28,000, 29,000, or more contiguous nucleotides in length
of the relevant nucleic acid molecule and having at least one functional feature of the
nucleic acid molecule (or the encoded protein has one functional feature of the protein
encoded by the nucleic acid molecule); or a fragment of a protein or a polypeptide 
containing at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 90, 100, 120, 
140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340, 360, 400, 500, 600, 700, 800, 900, 
1,000, 1,500, 2,000, 2,500, 3,000, 3,500, 4,000, 4,100, 4,200, 4,300, 4,350, 4,360, 4,370, or
4,380 amino acid residues in length of the relevant protein or polypeptide and having at
least one functional feature of the protein or polypeptide. 

The term "having a biological activity of the protein" or "having biological activities 
of the polypeptides of the invention" refers to the characteristics of the polypeptides or 
proteins having a common biological activity, a similar or identical structural domain, and/or
having sufficient amino acid identity to the polypeptide encoded by the nucleotide sequence
of SEQ ID NO: 1, 11, 13, 15, 16, 240, 737, 1108, 1590 or 1965. Such common biological
activities of the polypeptides of the invention include antigenicity and immunogenicity. 

The term "under stringent conditions" refers to hybridization and washing conditions
under which nucleotide sequences having at least 70%, at least 75%, at least 80%, at least
85%, at least 90%, or at least 95% identity to each other remain hybridized to each other.
Such hybridization conditions are described in, for example but not limited to, Current
Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6; Basic
Methods in Molecular Biology, Elsevier Science Publishing Co., Inc., N.Y. (1986), pp. 75-78
and 84-87; and Molecular Cloning, Cold Spring Harbor Laboratory, N.Y. (1982), pp.
387-389, and are well known to those skilled in the art. A preferred, non-limiting example
of stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate
(SSC), 0.5% SDS at about 68°C followed by one or more washes in 2X SSC, 0.5% SDS at
room temperature. Another preferred, non-limiting example of stringent hybridization 
conditions is hybridization in 6X SSC at about 45°C followed by one or more washes in 
0.2X SSC, 0.1% SDS at about 50-65°C. 
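
The identity percentages recited in the foregoing definition may be illustrated, without limitation, by the following minimal sketch. It computes percent identity between two sequences that are assumed to be already aligned and of equal length, and reports which of the recited thresholds are met. The sequences and function names are hypothetical. Whether two nucleic acids in fact remain hybridized under given conditions is determined experimentally; the sketch computes only the sequence identity to which the definition refers.

```python
# Illustrative sketch only; the inputs are hypothetical pre-aligned sequences.

# Identity thresholds recited in the definition of stringent conditions.
THRESHOLDS = (70, 75, 80, 85, 90, 95)


def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length aligned sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def thresholds_met(identity: float) -> list:
    """Return the recited thresholds that the given identity reaches."""
    return [t for t in THRESHOLDS if identity >= t]


identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")  # one mismatch in ten
print(identity, thresholds_met(identity))
```

In practice an alignment step would precede this calculation; the sketch omits it to keep the identity computation itself in view.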

The term "variant" as used herein refers either to a naturally occurring genetic
mutant of hSARS or a recombinantly prepared variation of hSARS, each of which contains
one or more mutations in its genome compared to the hSARS of CCTCC-V200303. The
term "variant" may also refer either to a naturally occurring variation of a given peptide or
a recombinantly prepared variation of a given peptide or protein in which one or more 
amino acid residues have been modified by amino acid substitution, addition, or deletion. 

4. DESCRIPTION OF THE FIGURES 

Figure 1 shows a partial DNA sequence (SEQ ID NO: 1) and its deduced amino acid
sequence (SEQ ID NO: 2) obtained from the SARS virus that has 57% homology to the
RNA-dependent RNA polymerase protein of known Coronaviruses.

Figure 2 shows an electron micrograph of the novel hSARS virus that has
morphological characteristics similar to those of coronaviruses.

Figure 3 shows an immunofluorescent staining for IgG antibodies that are
specifically bound to the FRhK-4 cells infected with the novel human respiratory virus of
Coronaviridae.

Figure 4 shows an electron micrograph of ultra-centrifuged deposit of hSARS virus
that was grown in the cell culture and negatively stained with 3% potassium
phosphotungstate at pH 7.0.

Figure 5A shows a thin-section electron micrograph of lung biopsy of a patient with
SARS; and Figure 5B shows a thin-section electron micrograph of hSARS-infected cells.

Figure 6 shows the result of phylogenetic analysis for the partial protein sequence
(215 amino acids; SEQ ID NO: 2) of the hSARS virus (GenBank accession number
AY268070). The phylogenetic tree is constructed by the neighbor-joining method. The
horizontal-line distance represents the number of sites at which the two sequences compared
are different. Bootstrap values are deduced from 500 replicates.

Figure 7A shows an amplification plot of fluorescence intensity against the PCR
cycle in a real-time quantitative PCR assay that can detect a hSARS virus in samples
quantitatively. The copy numbers of input plasmid DNA in the reactions are indicated. The
X-axis denotes the cycle number of a quantitative PCR assay and the Y-axis denotes the
fluorescence intensity (FI) over the background. Figure 7B shows the result of a melting
curve analysis of PCR products from clinical samples. Signals from positive (+ve) samples,
negative (-ve) samples and water control (water) are indicated. The X-axis denotes the
temperature (°C) and the Y-axis denotes the fluorescence intensity (FI) over the
background.

Figure 8 shows another partial DNA sequence (SEQ ID NO: 11) and its deduced
amino acid sequence (SEQ ID NO: 12) obtained from the SARS virus.

Figure 9 shows yet another partial DNA sequence (SEQ ID NO: 13) and its deduced
amino acid sequence (SEQ ID NO: 14) obtained from the SARS virus.

Figure 10 shows the entire genomic DNA sequence (SEQ ID NO: 15) of the SARS
virus.

Figure 11 shows the deduced amino acid sequences obtained from SEQ ID NO: 15 in
three frames (see SEQ ID NOS: 16, 240 and 737). An asterisk (*) indicates a stop codon
which marks the end of a peptide. The first-frame amino acid sequences: SEQ ID NOS: 17-239;
the second-frame amino acid sequences: SEQ ID NOS: 241-736; and the third-frame
amino acid sequences: SEQ ID NOS: 738-1107.

Figure 12 shows the deduced amino acid sequences obtained from the complement
of SEQ ID NO: 15 in three frames (see SEQ ID NOS: 1108, 1590 and 1965). An asterisk (*)
indicates a stop codon which marks the end of a peptide. The first-frame amino acid
sequences: SEQ ID NOS: 1109-1589; the second-frame amino acid sequences: SEQ ID
NOS: 1591-1964; and the third-frame amino acid sequences: SEQ ID NOS: 1966-2470.
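
The three-frame deduction described for Figures 11 and 12 may be sketched, by way of a non-limiting example, as follows. The sketch translates a DNA sequence in each of three reading frames using the standard genetic code, marks stop codons with an asterisk (*), and splits each frame into peptides at those asterisks. The input is a made-up placeholder and is not SEQ ID NO: 15; the function names are hypothetical. For the complementary strand, reading 5' to 3' (the reverse complement) is assumed.

```python
# Illustrative sketch only; the input is a hypothetical placeholder sequence.
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate_frame(seq: str, frame: int) -> str:
    """Translate `seq` starting at offset `frame` (0, 1 or 2); '*' marks a stop."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    # Codons containing ambiguous bases are reported as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(seq: str) -> list:
    """Return the translations of all three reading frames of one strand."""
    return [translate_frame(seq, frame) for frame in range(3)]


def peptides(translation: str) -> list:
    """Split a translated frame into peptides at each stop codon."""
    return [p for p in translation.split("*") if p]


example = "ATGGCCTAAGGATGA"  # hypothetical 15-nucleotide sequence
forward = three_frames(example)
reverse = three_frames(reverse_complement(example))

for frame in forward + reverse:
    print(frame, peptides(frame))
```

Each peptide returned by `peptides` corresponds to one run of amino acids between stop codons, which is the unit to which a separate sequence identifier would be assigned in a listing of this kind.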

5. DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to the isolated hSARS virus that morphologically and
phylogenetically relates to known Coronaviruses. In a specific embodiment, the isolated 
hSARS virus is that of CCTCC-V200303. In another specific embodiment, the virus 
comprises a nucleotide sequence of SEQ ID NO: 1, 11, 13, and/or 15. In a specific
embodiment, the present invention provides isolated nucleic acid molecules of the hSARS
virus, comprising, or, alternatively, consisting of the nucleotide sequence of SEQ ID NO: 1,
11, 13, and/or 15, a complement thereof or a portion thereof. In another specific
embodiment, the invention provides isolated nucleic acid molecules which hybridize under
stringent conditions, as defined herein, to a nucleic acid molecule having the sequence of
SEQ ID NO: 1, 11, 13, or 15, or specific genes of known members of Coronaviridae, or a
complement thereof. In another specific embodiment, the invention provides isolated 
polypeptides or proteins that are encoded by a nucleic acid molecule comprising a 
nucleotide sequence that is at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 
300, 350, 400, 450, 500, 550, 600, or more contiguous nucleotides of the nucleotide
sequence of SEQ ID NO: 1, or a complement thereof. In another specific embodiment, the
invention provides isolated polypeptides or proteins that are encoded by a nucleic acid 
molecule comprising a nucleotide sequence that is at least about 5, 10, 15, 20, 25, 30, 35, 40, 
45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 
1,000, 1,050, 1,100, 1,150, 1,200, or more contiguous nucleotides of the nucleotide
sequence of SEQ ID NO: 11, or a complement thereof. In yet another specific embodiment,
the invention provides isolated polypeptides or proteins that are encoded by a nucleic acid 
molecule comprising a nucleotide sequence that is at least about 5, 10, 15, 20, 25, 30, 35, 40, 
45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, or more contiguous 
nucleotides of the nucleotide sequence of SEQ ID NO: 13, or a complement thereof. In yet
another specific embodiment, the invention provides isolated polypeptides or proteins that
are encoded by a nucleic acid molecule comprising or, alternatively consisting of a 
nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 
400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 
1,200, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 11,000, 12,000,
13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000, 22,000, 23,000,
24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or more contiguous nucleotides of the 
nucleotide sequence of SEQ ID NO: 15, or a complement thereof. The polypeptides include
those shown in Figures 11 (SEQ ID NOS: 17-239, 241-736 and 738-1107) and 12 (SEQ ID
NOS: 1109-1589, 1591-1964 and 1966-2470). The polypeptides or the proteins of the
present invention preferably have one or more biological activities of the proteins encoded
by the sequence of SEQ ID NO: 1, 11, 13, or 15, or the native viral proteins containing the
amino acid sequences encoded by the sequence of SEQ ID NO: 1, 11, 13, or 15, or those
shown in Figures 11 (SEQ ID NOS: 17-239, 241-736 and 738-1107) and 12 (SEQ
ID NOS: 1109-1589, 1591-1964 and 1966-2470).

The present invention also relates to a method for propagating the hSARS virus in
host cells.

The invention further relates to the use of the sequence information of the isolated 
virus for diagnostic and therapeutic methods. In a specific embodiment, the invention 
provides the entire nucleotide sequence of hSARS virus, CCTCC-V200303, SEQ ID NO: 15, 
or fragments, or a complement thereof. Furthermore, the present invention relates to a
nucleic acid molecule that hybridizes to any portion of the genome of the hSARS virus,
CCTCC-V200303, SEQ ID NO: 15, under stringent conditions. In a specific
embodiment, the invention provides nucleic acid molecules which are suitable for use as
primers consisting of or comprising the nucleotide sequence of SEQ ID NO: 1, 11, 13, or 15,
or a complement thereof, or a portion thereof. In a non-limiting embodiment, the invention
provides the primers consisting of or comprising the nucleotide sequence of SEQ ID NOS:3 
and/or 4. In another specific embodiment, the invention provides nucleic acid molecules 
which are suitable for use as hybridization probes for the detection of nucleic acids 
encoding a polypeptide of the invention, consisting of or comprising the nucleotide 
sequence of SEQ ID NO: 1, 11, 13, or 15, a complement thereof, or a portion thereof. The 
invention further encompasses chimeric or recombinant viruses or viral proteins encoded by 
said nucleotide sequences. 
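
By way of illustration only, the relationship between a nucleotide sequence, its complement, and a contiguous portion of a stated minimum length may be sketched in Python as follows. The short sequence and the function names below are hypothetical placeholders and are not taken from the Sequence Listing; hybridization stringency is not modeled, and only exact matches are considered.

```python
# Illustrative sketch only. The sequence below is an arbitrary placeholder,
# not any part of SEQ ID NO: 1, 11, 13, or 15.

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def is_contiguous_fragment(candidate: str, reference: str, min_len: int) -> bool:
    """True if candidate is at least min_len long and occurs as an unbroken
    run in the reference or in the reference's reverse complement."""
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < min_len:
        return False
    return candidate in reference or candidate in reverse_complement(reference)


reference = "ATGGCTAGCTTAGGCCATCG"           # hypothetical 20-mer
probe = reverse_complement(reference[5:15])  # 10-mer complementary to positions 6-15
```

In this sketch, `is_contiguous_fragment(probe, reference, 10)` is satisfied because the probe is the exact complement of ten contiguous nucleotides of the placeholder reference.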

The invention further provides antibodies that specifically bind a polypeptide of the
invention encoded by the nucleotide sequence of SEQ ID NO: 1, 11, 13, 16, 240, 737, 1108,
1590 or 1965, or a fragment thereof, or any hSARS epitope. The invention further provides
antibodies that specifically bind the polypeptides of the invention encoded by the nucleotide
sequence of SEQ ID NO: 15, or a fragment thereof, or any hSARS epitope. Such antibodies
include, but are not limited to, polyclonal, monoclonal, bi-specific, multi-specific, human,
humanized, chimeric antibodies, single chain antibodies, Fab fragments, F(ab')2 fragments,
disulfide-linked Fvs, intrabodies and fragments containing either a VL or VH domain or
even a complementarity determining region (CDR) that specifically binds to a polypeptide of
the invention.

In one embodiment, the invention provides methods for detecting the presence,
activity or expression of the hSARS virus of the invention in a biological material, such as
cells, blood, saliva, urine, sputum, nasopharyngeal aspirates, and so forth. The presence of
the hSARS virus in a sample can be determined by contacting the biological material with
an agent which can detect directly or indirectly the presence of the hSARS virus. In a
specific embodiment, the detection agents are the antibodies of the present invention. In
another embodiment, the detection agent is a nucleic acid of the present invention.

In another embodiment, the invention provides vaccine preparations comprising the
hSARS virus, including recombinant and chimeric forms of said virus, or subunits of the
virus. In a specific embodiment, the vaccine preparations comprise live but attenuated
hSARS virus with or without pharmaceutically acceptable carriers, including adjuvants. In
another specific embodiment, the vaccine preparations comprise an inactivated or killed
hSARS virus with or without pharmaceutically acceptable carriers, including adjuvants.

The present invention further provides methods of preparing recombinant or
chimeric forms of hSARS. In another specific embodiment, the vaccine preparations of the
present invention comprise one or more nucleic acid molecules comprising or consisting of
the sequence of SEQ ID NO: 1, 11, 13, and/or 15, or a fragment thereof. In another
embodiment, the invention provides vaccine preparations comprising one or more
polypeptides of the invention encoded by a nucleotide sequence comprising or consisting of
the nucleotide sequence of SEQ ID NO: 1, 11, 13, 16, 240, 737, 1108, 1590 and/or 1965, or
a fragment thereof. In another embodiment, the invention provides vaccine preparations
comprising one or more polypeptides of the invention encoded by a nucleotide sequence
comprising or consisting of the nucleotide sequence of SEQ ID NO: 15, or a fragment
thereof. Furthermore, the present invention provides methods for treating, ameliorating,
managing, or preventing SARS by administering the vaccine preparations or antibodies of
the present invention alone or in combination with antivirals [e.g., amantadine, rimantadine,
ganciclovir, acyclovir, ribavirin, penciclovir, oseltamivir, foscarnet, zidovudine (AZT),
didanosine (ddI), lamivudine (3TC), zalcitabine (ddC), stavudine (d4T), nevirapine,
delavirdine, indinavir, ritonavir, vidarabine, nelfinavir, saquinavir, relenza, tamiflu,
pleconaril, interferons, etc.], steroids and corticosteroids such as prednisone, cortisone,
fluticasone and glucocorticoid, antibiotics, analgesics, bronchodilators, or other treatments
for respiratory and/or viral infections.

Furthermore, the present invention provides pharmaceutical compositions
comprising anti-viral agents of the present invention and a pharmaceutically acceptable
carrier. The present invention also provides kits comprising pharmaceutical compositions
of the present invention.

In another aspect, the present invention provides methods for screening anti-viral 
agents that inhibit the infectivity or replication of hSARS virus or variants thereof. 

5.1 Recombinant and Chimeric hSARS Viruses 

The present invention encompasses recombinant or chimeric viruses encoded by
viral vectors derived from the genome of hSARS virus or natural variants thereof. In a
specific embodiment, a recombinant virus is one derived from the hSARS virus of deposit
accession no. CCTCC-V200303. In a specific embodiment, the virus has a nucleotide
sequence of SEQ ID NO: 15. In another specific embodiment, a recombinant virus is one
derived from a natural variant of hSARS virus. A natural variant of hSARS has a sequence
that is different from the genomic sequence (SEQ ID NO: 15) of the hSARS virus,
CCTCC-V200303, due to one or more naturally occurring mutations, including, but not limited to,
point mutations, rearrangements, insertions, deletions, etc., to the genomic sequence that
may or may not result in a phenotypic change. In accordance with the present invention, a
viral vector which is derived from the genome of the hSARS virus, CCTCC-V200303, is
one that contains a nucleic acid sequence that encodes at least a part of one ORF of the
hSARS virus. In a specific embodiment, the ORF comprises or consists of a nucleotide
sequence of SEQ ID NO: 1, 11 or 13, or a fragment thereof. In a specific embodiment, there
is more than one ORF within the nucleotide sequence of SEQ ID NO: 15 or a complement
thereof, as shown in Figures 11 (SEQ ID NOS: 16, 240 and 737) and 12 (SEQ ID NOS: 1108,
1590 and 1965), or a fragment thereof. In another embodiment, the polypeptide encoded by
the ORF comprises or consists of an amino acid sequence of SEQ ID NO: 2, 12, or 14, or a
fragment thereof, or shown in Figures 11 (SEQ ID NOS: 17-239, 241-736 and 738-1107) and
12 (SEQ ID NOS: 1109-1589, 1591-1964 and 1966-2470), or a fragment thereof. In
accordance with the present invention these viral vectors may or may not include nucleic
acids that are non-native to the viral genome.
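
As a non-limiting illustration of how point mutations distinguishing a variant from a reference genomic sequence might be enumerated, the following Python sketch compares two sequences of equal length position by position. The function name and the toy sequences are hypothetical; insertions, deletions and rearrangements would require a prior alignment step, which is not shown here.

```python
# Illustrative sketch only. Handles substitutions between equal-length
# sequences; indels and rearrangements need an alignment step first.

def point_differences(reference: str, variant: str):
    """Return a list of (position, reference_base, variant_base) tuples,
    with 1-based positions, for every site where the sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences differ in length; align them first")
    return [
        (index + 1, ref_base, var_base)
        for index, (ref_base, var_base) in enumerate(zip(reference, variant))
        if ref_base != var_base
    ]


# Hypothetical toy sequences, not drawn from SEQ ID NO: 15.
toy_reference = "ATGGCTAGCT"
toy_variant = "ATGACTAGTT"
differences = point_differences(toy_reference, toy_variant)
```

An empty list would indicate that the two sequences are identical over the compared length.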

In another specific embodiment, a chimeric virus of the invention is a recombinant
hSARS virus which further comprises a heterologous nucleotide sequence. In accordance
with the invention, a chimeric virus may be encoded by a nucleotide sequence in which
heterologous nucleotide sequences have been added to the genome or in which endogenous
or native nucleotide sequences have been replaced with heterologous nucleotide sequences.

According to the present invention, the chimeric viruses are encoded by the viral
vectors of the invention which further comprise a heterologous nucleotide sequence. In
accordance with the present invention a chimeric virus is encoded by a viral vector that may
or may not include nucleic acids that are non-native to the viral genome. In accordance
with the invention a chimeric virus is encoded by a viral vector to which heterologous
nucleotide sequences have been added, inserted or substituted for native or non-native
sequences. In accordance with the present invention, the chimeric virus may be encoded by
nucleotide sequences derived from different strains or variants of hSARS virus. In
particular, the chimeric virus is encoded by nucleotide sequences that encode antigenic
polypeptides derived from different strains or variants of hSARS virus.

A chimeric virus may be of particular use for the generation of recombinant vaccines
protecting against two or more viruses (Tao et al., J. Virol. 72, 2955-2961; Durbin et al.,
2000, J. Virol. 74, 6821-6831; Skiadopoulos et al., 1998, J. Virol. 72, 1762-1768;
Teng et al., 2000, J. Virol. 74, 9317-9321). For example, it can be envisaged that a virus
vector derived from the hSARS virus expressing one or more proteins of variants of hSARS
virus, or vice versa, will protect a subject vaccinated with such vector against infections by
both the native hSARS and the variant. Attenuated and replication-defective viruses may be
of use for vaccination purposes with live vaccines as has been suggested for other viruses.
(See PCT WO 02/057302, at pp. 6 and 23, incorporated by reference herein).

In accordance with the present invention the heterologous sequence to be
incorporated into the viral vectors encoding the recombinant or chimeric viruses of the
invention includes sequences obtained or derived from different strains or variants of hSARS.

In certain embodiments, the chimeric or recombinant viruses of the invention are
encoded by viral vectors derived from viral genomes wherein one or more sequences,
intergenic regions, termini sequences, or portions or entire ORF have been substituted with
a heterologous or non-native sequence. In certain embodiments of the invention, the
chimeric viruses of the invention are encoded by viral vectors derived from viral genomes
wherein one or more heterologous sequences have been inserted or added to the vector.

The selection of the viral vector may depend on the species of the subject that is to 
be treated or protected from a viral infection. If the subject is human, then an attenuated 
hSARS virus can be used to provide the antigenic sequences.

In accordance with the present invention, the viral vectors can be engineered to 
provide antigenic sequences which confer protection against infection by the hSARS and
natural variants thereof. The viral vectors may be engineered to provide one, two, three or
more antigenic sequences. In accordance with the present invention the antigenic sequences
may be derived from the same virus, from different strains or variants of the same type of
virus, or from different viruses. 

The expression products and/or recombinant or chimeric virions obtained in 
accordance with the invention may advantageously be utilized in vaccine formulations. The 
expression products and chimeric virions of the present invention may be engineered to 
create vaccines against a broad range of pathogens, including viral and bacterial antigens,
tumor antigens, allergen antigens, and autoantigens involved in autoimmune disorders. In
particular, the chimeric virions of the present invention may be engineered to create
vaccines for the protection of a subject from infections with hSARS virus and variants
thereof.

In certain embodiments, the expression products and recombinant or chimeric
virions of the present invention may be engineered to create vaccines against a broad range
of pathogens, including viral antigens, tumor antigens and autoantigens involved in
autoimmune disorders. One way to achieve this goal involves modifying existing hSARS
genes to contain foreign sequences in their respective external domains. Where the
heterologous sequences are epitopes or antigens of pathogens, these chimeric viruses may
be used to induce a protective immune response against the disease agent from which these
determinants are derived.

Thus, the present invention relates to the use of viral vectors and recombinant or
chimeric viruses to formulate vaccines against a broad range of viruses and/or antigens.
The present invention also encompasses recombinant viruses comprising a viral vector
derived from the hSARS or variants thereof which contains sequences which result in a
virus having a phenotype more suitable for use in vaccine formulations, e.g., attenuated
phenotype or enhanced antigenicity. The mutations and modifications can be in coding
regions, in intergenic regions and in the leader and trailer sequences of the virus.

The invention provides a host cell comprising a nucleic acid or a vector according to
the invention. Plasmid or viral vectors containing the polymerase components of hSARS
virus are generated in prokaryotic cells for the expression of the components in relevant cell
types (bacteria, insect cells, eukaryotic cells). Plasmid or viral vectors containing
full-length or partial copies of the hSARS genome will be generated in prokaryotic cells for
the expression of viral nucleic acids in vitro or in vivo. The latter vectors may contain
other viral sequences for the generation of chimeric viruses or chimeric virus proteins, may
lack parts of the viral genome for the generation of replication defective virus, and may
contain mutations, deletions or insertions for the generation of attenuated viruses. In
addition, the present invention provides a host cell infected with hSARS virus, for example,
of deposit no. CCTCC-V200303.

Infectious copies of hSARS (being wild type, attenuated, replication-defective or
chimeric) can be produced upon co-expression of the polymerase components according to
the state-of-the-art technologies described above.

In addition, eukaryotic cells transiently or stably expressing one or more full-length
or partial hSARS proteins can be used. Such cells can be made by transfection (proteins or
nucleic acid vectors), infection (viral vectors) or transduction (viral vectors) and may be
useful for complementation of the mentioned wild type, attenuated, replication-defective or
chimeric viruses.

The viral vectors and chimeric viruses of the present invention may be used to 
modulate a subject's immune system by stimulating a humoral immune response, a cellular 
immune response or by stimulating tolerance to an antigen. As used herein, a subject means:
humans, primates, horses, cows, sheep, pigs, goats, dogs, cats, avian species and rodents.

5.2 Formulation of Vaccines and Antivirals 

In a preferred embodiment, the invention provides a proteinaceous molecule or
hSARS virus specific viral protein or functional fragment thereof encoded by a nucleic acid
according to the invention. Useful proteinaceous molecules are for example derived from
any of the genes or genomic fragments derivable from the virus according to the invention,
including envelope protein (E protein), integral membrane protein (M protein), spike protein
(S protein), nucleocapsid protein (N protein), hemagglutinin esterase (HE protein), and
RNA-dependent RNA polymerase. Such molecules, or antigenic fragments thereof, as provided
herein, are for example useful in diagnostic methods or kits and in pharmaceutical
compositions such as subunit vaccines. Particularly useful are polypeptides encoded by the
nucleotide sequence of SEQ ID NO: 1, 11, 13, or 15, or as shown in Figures 11 (SEQ ID
NOS: 17-239, 241-736 and 738-1107) and 12 (SEQ ID NOS: 1109-1589, 1591-1964 and
1966-2470), or antigenic fragments thereof for inclusion as antigen or subunit immunogen,
but inactivated whole virus can also be used. Particularly useful are also those
proteinaceous substances that are encoded by recombinant nucleic acid fragments of the
hSARS genome; of course, preferred are those that are within the preferred metes and
bounds of ORFs, in particular, for eliciting hSARS specific antibody or T cell responses,
whether in vivo (e.g. for protective or therapeutic purposes or for providing diagnostic
antibodies) or in vitro (e.g. by phage display technology or another technique useful for
generating synthetic antibodies).

The invention provides vaccine formulations for the prevention and treatment of
infections with hSARS virus. In certain embodiments, the vaccine of the invention
comprises recombinant and chimeric viruses of the hSARS virus. In certain embodiments,
the virus is attenuated.

In another embodiment of this aspect of the invention, inactivated vaccine
formulations may be prepared using conventional techniques to "kill" the chimeric viruses.
Inactivated vaccines are "dead" in the sense that their infectivity has been destroyed.
Ideally, the infectivity of the virus is destroyed without affecting its immunogenicity. In
order to prepare inactivated vaccines, the chimeric virus may be grown in cell culture or in
the allantois of the chick embryo, purified by zonal ultracentrifugation, inactivated by
formaldehyde or β-propiolactone, and pooled. The resulting vaccine is usually inoculated
intramuscularly.

Inactivated viruses may be formulated with a suitable adjuvant in order to enhance
the immunological response. Such adjuvants may include but are not limited to mineral
gels, e.g., aluminum hydroxide; surface active substances such as lysolecithin, pluronic
polyols, polyanions; peptides; oil emulsions; and potentially useful human adjuvants such as
BCG and Corynebacterium parvum.

In another aspect, the present invention also provides DNA vaccine formulations
comprising a nucleic acid or fragment of the hSARS virus, e.g., the virus having accession
no. CCTCC-V200303, or nucleic acid molecules having the sequence of SEQ ID NO: 1, 11,
13, or 15, or a fragment thereof. In another specific embodiment, the DNA vaccine
formulations of the present invention comprise a nucleic acid or fragment thereof encoding
antibodies which immunospecifically bind hSARS viruses. In DNA vaccine
formulations, a vaccine DNA comprises a viral vector, such as that derived from the hSARS
virus, bacterial plasmid, or other expression vector, bearing an insert comprising a nucleic
acid molecule of the present invention operably linked to one or more control elements,
thereby allowing expression of the vaccinating proteins encoded by said nucleic acid
molecule in a vaccinated subject. Such vectors can be prepared by recombinant DNA
technology as recombinant or chimeric viral vectors carrying a nucleic acid molecule of the
present invention (see also Section 5.1, supra).

Various heterologous vectors are described for DNA vaccinations against viral
infections. For example, the vectors described in the following references may be used to
express hSARS sequences instead of the sequences of the viruses or other pathogens
described; in particular, vectors described for hepatitis B virus (Michel, M.L. et al., 1995,
DNA-mediated immunization to the hepatitis B surface antigen in mice: Aspects of the
humoral response mimic hepatitis B viral infection in humans, Proc. Natl. Acad. Sci. USA
92:5307-5311; Davis, H.L. et al., 1993, DNA-based immunization induces continuous
secretion of hepatitis B surface antigen and high levels of circulating antibody, Human Molec.
Genetics 2:1847-1851), HIV virus (Wang, B. et al., 1993, Gene inoculation generates
immune responses against human immunodeficiency virus type 1, Proc. Natl. Acad. Sci. USA
90:4156-4160; Lu, S. et al., 1996, Simian immunodeficiency virus DNA vaccine trial in
macaques, J. Virol. 70:3978-3991; Letvin, N.L. et al., 1997, Potent, protective anti-HIV
immune responses generated by bimodal HIV envelope DNA plus protein vaccination, Proc.
Natl. Acad. Sci. USA 94(17):9378-83), and influenza viruses (Robinson, H.L. et al., 1993,
Protection against a lethal influenza virus challenge by immunization with a
haemagglutinin-expressing plasmid DNA, Vaccine 11:957-960; Ulmer, J.B. et al.,
Heterologous protection against influenza by injection of DNA encoding a viral protein,
Science 259:1745-1749), as well as bacterial infections, such as tuberculosis (Tascon, R.E.
et al., 1996, Vaccination against tuberculosis by DNA injection, Nature Med. 2:888-892;
Huygen, K. et al., 1996, Immunogenicity and protective efficacy of a tuberculosis DNA
vaccine, Nature Med. 2:893-898), and parasitic infection, such as malaria (Sedegah, M.,
1994, Protection against malaria by immunization with plasmid DNA encoding
circumsporozoite protein, Proc. Natl. Acad. Sci. USA 91:9866-9870; Doolan, D.L. et al.,
1996, Circumventing genetic restriction of protection against malaria with multigene DNA
immunization: CD8+ T cell-, interferon γ-, and nitric oxide-dependent immunity, J. Exper.
Med. 183:1739-1746).

Many methods may be used to introduce the vaccine formulations described above.
These include, but are not limited to, oral, intradermal, intramuscular, intraperitoneal,
intravenous, subcutaneous, and intranasal routes. Alternatively, it may be preferable to
introduce the chimeric virus vaccine formulation via the natural route of infection of the
pathogen for which the vaccine is designed. The DNA vaccines of the present invention
may be administered in saline solutions by injections into muscle or skin using a syringe
and needle (Wolff, J.A. et al., 1990, Direct gene transfer into mouse muscle in vivo, Science
247:1465-1468; Raz, E., 1994, Intradermal gene immunization: The possible role of DNA
uptake in the induction of cellular immunity to viruses, Proc. Natl. Acad. Sci. USA
91:9519-9523). Another way to administer DNA vaccines is the "gene gun" method, whereby
microscopic gold beads coated with the DNA molecules of interest are fired into the cells
(Tang, D. et al., 1992, Genetic immunization is a simple method for eliciting an immune
response, Nature 356:152-154). For general reviews of the methods for DNA vaccines, see
Robinson, H.L., 1999, DNA vaccines: basic mechanism and immune responses (Review),
Int. J. Mol. Med. 4(5):549-555; Barber, B., 1997, Introduction: Emerging vaccine strategies,
Seminars in Immunology 9(5):269-270; and Robinson, H.L. et al., 1997, DNA vaccines,
Seminars in Immunology 9(5):271-283.

5.3 Attenuation of hSARS Virus or Variants Thereof 

The hSARS virus or variants thereof of the invention can be genetically engineered
to exhibit an attenuated phenotype. In particular, the viruses of the invention exhibit an
attenuated phenotype in a subject to which the virus is administered as a vaccine.
Attenuation can be achieved by any method known to a skilled artisan. Without being
bound by theory, the attenuated phenotype of the viruses of the invention can be caused, e.g.,
by using a virus that naturally does not replicate well in an intended host species, for
example, by reduced replication of the viral genome, by reduced ability of the virus to infect
a host cell, or by reduced ability of the viral proteins to assemble into an infectious viral
particle relative to the wild type strain of the virus.

The attenuated phenotypes of hSARS virus or variants thereof can be tested by any
method known to the artisan. A candidate virus can, for example, be tested for its ability to
infect a host or for the rate of replication in a cell culture system. In certain embodiments,
growth curves at different temperatures are used to test the attenuated phenotype of the
virus. For example, an attenuated virus is able to grow at 35°C, but not at 39°C or 40°C. In
certain embodiments, different cell lines can be used to evaluate the attenuated phenotype of
the virus. For example, an attenuated virus may only be able to grow in monkey cell lines
but not the human cell lines, or the achievable virus titers in different cell lines are different
for the attenuated virus. In certain embodiments, viral replication in the respiratory tract of
a small animal model, including but not limited to, hamsters, cotton rats, mice and guinea
pigs, is used to evaluate the attenuated phenotypes of the virus. In other embodiments, the
immune response induced by the virus, including but not limited to, the antibody titers (e.g.,
assayed by plaque reduction neutralization assay or ELISA) is used to evaluate the
attenuated phenotypes of the virus. In a specific embodiment, the plaque reduction
neutralization assay or ELISA is carried out at a low dose. In certain embodiments, the
ability of the hSARS virus to elicit pathological symptoms in an animal model can be tested.
A reduced ability of the virus to elicit pathological symptoms in an animal model system is
indicative of its attenuated phenotype. In a specific embodiment, the candidate viruses are
tested in a monkey model for nasal infection, indicated by mucous production.

The viruses of the invention can be attenuated such that one or more of the
functional characteristics of the virus are impaired. In certain embodiments, attenuation is
measured in comparison to the wild type strain of the virus from which the attenuated virus
is derived. In other embodiments, attenuation is determined by comparing the growth of an
attenuated virus in different host systems. Thus, for a non-limiting example, hSARS virus
or a variant thereof is said to be attenuated when grown in a human host if the growth of the
hSARS or variant thereof in the human host is reduced compared to the non-attenuated
hSARS or variant thereof.

In certain embodiments, the attenuated virus of the invention is capable of infecting
a host and is capable of replicating in a host such that infectious viral particles are produced. In
comparison to the wild type strain, however, the attenuated strain grows to lower titers or
grows more slowly. Any technique known to the skilled artisan can be used to determine
the growth curve of the attenuated virus and compare it to the growth curve of the wild type
virus.
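
Purely as an illustrative sketch, a comparison of growth curves of the kind referred to above might be expressed in Python as follows. The titer values, the function names and the one-log10 threshold are hypothetical assumptions chosen for the example; they are not data or criteria from the present disclosure.

```python
import math

# Illustrative sketch only. Each growth curve is a list of
# (hours_post_infection, titer) pairs; all values below are invented.


def peak_titer(curve):
    """Return the highest titer observed in a growth curve."""
    return max(titer for _, titer in curve)


def log10_titer_reduction(attenuated, wild_type):
    """Difference in log10 peak titer (wild type minus attenuated)."""
    return math.log10(peak_titer(wild_type)) - math.log10(peak_titer(attenuated))


def grows_to_lower_titer(attenuated, wild_type, min_log10_drop=1.0):
    """True if the attenuated strain peaks at least min_log10_drop log10
    units below the wild type (threshold is an assumed, arbitrary cutoff)."""
    return log10_titer_reduction(attenuated, wild_type) >= min_log10_drop


wild_type_curve = [(0, 1e2), (24, 1e5), (48, 1e7)]    # hypothetical
attenuated_curve = [(0, 1e2), (24, 1e3), (48, 1e5)]   # hypothetical
```

A fuller comparison would also consider the rate at which titers rise, since an attenuated strain may reach a similar peak but grow more slowly.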

In certain embodiments, the attenuated virus of the invention (e.g., a recombinant or
chimeric hSARS) cannot replicate in human cells as well as the wild type virus (e.g., wild
type hSARS) does. However, the attenuated virus can replicate well in a cell line that lacks
interferon functions, such as Vero cells.

In other embodiments, the attenuated virus of the invention is capable of infecting a
host, of replicating in the host, and of causing proteins of the virus of the invention to be
inserted into the cytoplasmic membrane, but the attenuated virus does not cause the host to
produce new infectious viral particles. In certain embodiments, the attenuated virus infects
the host, replicates in the host, and causes viral proteins to be inserted in the cytoplasmic
membrane of the host with the same efficiency as the wild type hSARS. In other
embodiments, the ability of the attenuated virus to cause viral proteins to be inserted into
the cytoplasmic membrane of the host cell is reduced compared to the wild type virus. In
certain embodiments, the ability of the attenuated hSARS virus to replicate in the host is
reduced compared to the wild type virus. Any technique known to the skilled artisan can be
used to determine whether a virus is capable of infecting a mammalian cell, of replicating
within the host, and of causing viral proteins to be inserted into the cytoplasmic membrane
of the host.

In certain embodiments, the attenuated virus of the invention is capable of infecting
a host. In contrast to the wild type hSARS, however, the attenuated hSARS cannot be
replicated in the host. In a specific embodiment, the attenuated hSARS virus can infect a
host and can cause the host to insert viral proteins in its cytoplasmic membranes, but the
attenuated virus is incapable of being replicated in the host. Any method known to the
skilled artisan can be used to test whether the attenuated hSARS has infected the host and
has caused the host to insert viral proteins in its cytoplasmic membranes.

In certain embodiments, the ability of the attenuated virus to infect a host is reduced 
compared to the ability of the wild type virus to infect the same host. Any technique known 
to the skilled artisan can be used to determine whether a virus is capable of infecting a host.

In certain embodiments, mutations (e.g., missense mutations) are introduced into the
genome of the virus, for example, into the sequence of SEQ ID NO: 1, 11, 13, or 15, to
generate a virus with an attenuated phenotype. Mutations (e.g., missense mutations) can be
introduced into the structural genes and/or regulatory genes of the hSARS. Mutations can
be additions, substitutions, deletions, or combinations thereof. Such variants of hSARS can
be screened for a predicted functionality, such as infectivity, replication ability, protein
synthesis ability, assembling ability, as well as cytopathic effect in cell cultures. In a
specific embodiment, the missense mutation is a cold-sensitive mutation. In another
embodiment, the missense mutation is a heat-sensitive mutation. In another embodiment,
the missense mutation prevents a normal processing or cleavage of the viral proteins.

In other embodiments, deletions are introduced into the genome of the hSARS virus, 
which result in the attenuation of the virus. 

In certain embodiments, attenuation of the virus is achieved by replacing a gene of
the wild type virus with a gene of a virus of a different species, of a different subgroup, or
of a different variant. In another aspect, attenuation of the virus is achieved by replacing
one or more specific domains of a protein of the wild type virus with domains derived from
the corresponding protein of a virus of a different species. In certain other embodiments,
attenuation of the virus is achieved by deleting one or more specific domains of a protein of
the wild type virus.

When a live attenuated vaccine is used, its safety must also be considered. The 
vaccine must not cause disease. Any techniques known in the art that can make a vaccine 
safe may be used in the present invention. In addition to attenuation techniques, other
techniques may be used. One non-limiting example is to use a soluble heterologous gene
that cannot be incorporated into the virion membrane. For example, a single copy of the 
soluble version of a viral transmembrane protein lacking the transmembrane and cytosolic 
domains thereof, can be used. 

Various assays can be used to test the safety of a vaccine. For example, sucrose
gradients and neutralization assays can be used to test the safety. A sucrose gradient assay
can be used to determine whether a heterologous protein is inserted in a virion. If the 
heterologous protein is inserted in the virion, the virion should be tested for its ability to 
cause symptoms in an appropriate animal model since the virus may have acquired new, 
possibly pathological, properties. 

5.4 Adjuvants and Carrier Molecules

hSARS-associated antigens are administered with one or more adjuvants. In one
embodiment, the hSARS-associated antigen is administered together with a mineral salt
adjuvant or mineral salt gel adjuvant. Such mineral salt and mineral salt gel adjuvants
include, but are not limited to, aluminum hydroxide (ALHYDROGEL, REHYDRAGEL),
aluminum phosphate gel, aluminum hydroxyphosphate (ADJU-PHOS), and calcium
phosphate.

In another embodiment, the hSARS-associated antigen is administered with an
immunostimulatory adjuvant. This class of adjuvants includes, but is not limited to,
cytokines (e.g., interleukin-2, interleukin-7, interleukin-12, granulocyte-macrophage colony
stimulating factor (GM-CSF), interferon-γ, interleukin-1β (IL-1β), and IL-1β peptide or
Sclavo Peptide), cytokine-containing liposomes, triterpenoid glycosides or saponins (e.g.,
QuilA and QS-21, also sold under the trademark STIMULON, ISCOPREP), Muramyl
Dipeptide (MDP) derivatives, such as N-acetyl-muramyl-L-threonyl-D-isoglutamine
(Threonyl-MDP, sold under the trademark TERMURTIDE), GMDP,
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine,
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine,
muramyl tripeptide phosphatidylethanolamine (MTP-PE), unmethylated CpG dinucleotides and
oligonucleotides, such as bacterial DNA and fragments thereof, LPS, monophosphoryl
Lipid A (3D-MLA, sold under the trademark MPL), and polyphosphazenes.

In another embodiment, the adjuvant used is a particulate adjuvant, including, but not
limited to, emulsions, e.g., Freund's Complete Adjuvant, Freund's Incomplete Adjuvant,
squalene or squalane oil-in-water adjuvant formulations, such as SAF and MF59, e.g.,
prepared with block-copolymers, such as L-121 (polyoxypropylene/polyoxyethylene) sold
under the trademark PLURONIC L-121, Liposomes, Virosomes, cochleates, and immune
stimulating complex, which is sold under the trademark ISCOM.

In another embodiment, a microparticulate adjuvant is used. Microparticulate
adjuvants include, but are not limited to, biodegradable and biocompatible polyesters, homo-
and copolymers of lactic acid (PLA) and glycolic acid (PGA), poly(lactide-co-glycolides)
(PLGA) microparticles, polymers that self-associate into particulates (poloxamer particles),
soluble polymers (polyphosphazenes), and virus-like particles (VLPs) such as recombinant
protein particulates, e.g., hepatitis B surface antigen (HBsAg).

Yet another class of adjuvants that may be used includes mucosal adjuvants,
including but not limited to heat-labile enterotoxin from Escherichia coli (LT), cholera
holotoxin (CT) and cholera toxin B subunit (CTB) from Vibrio cholerae, mutant toxins
(e.g., LTK63 and LTR72), microparticles, and polymerized liposomes.

In other embodiments, any of the above classes of adjuvants may be used in
combination with each other or with other adjuvants. Non-limiting examples
of combination adjuvant preparations that can be used to administer the hSARS-associated
antigens of the invention include liposomes containing immunostimulatory protein,
cytokines, or T-cell and/or B-cell peptides, or microbes with or without entrapped IL-2 or
microparticles containing enterotoxin. Other adjuvants known in the art are also included
within the scope of the invention (see Vaccine Design: The Subunit and Adjuvant Approach,
Chap. 7, Michael F. Powell and Mark J. Newman (eds.), Plenum Press, New York, 1995,
which is incorporated herein in its entirety).

The effectiveness of an adjuvant may be determined by measuring the induction of
antibodies directed against an immunogenic polypeptide containing an hSARS polypeptide
epitope, the antibodies resulting from administration of this polypeptide in vaccines which
also comprise the various adjuvants.

The polypeptides may be formulated into the vaccine as neutral or salt forms.
Pharmaceutically acceptable salts include the acid addition salts (formed with free amino
groups of the peptide), which are formed with inorganic acids, such as, for example,
hydrochloric or phosphoric acids, or organic acids such as acetic, oxalic, tartaric, maleic,
and the like. Salts formed with free carboxyl groups may also be derived from inorganic
bases, such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides,
and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine,
procaine and the like.

The vaccines of the invention may be multivalent or univalent. Multivalent vaccines 
are made from recombinant viruses that direct the expression of more than one antigen. 

Many methods may be used to introduce the vaccine formulations of the invention;
these include but are not limited to oral, intradermal, intramuscular, intraperitoneal,
intravenous, subcutaneous, intranasal routes, and via scarification (scratching through the
top layers of skin, e.g., using a bifurcated needle).

The patient to whom the vaccine is administered is preferably a mammal, most
preferably a human, but can also be a non-human animal including but not limited to cows,
horses, sheep, pigs, fowl (e.g., chickens), goats, cats, dogs, hamsters, mice and rats.



5.5 Preparation of Antibodies

Antibodies which specifically recognize a polypeptide of the invention, such as, but
not limited to, polypeptides comprising the sequence of SEQ ID NO:2, 12, and 14, and
polypeptides as shown in Figures 11 (SEQ ID NOS: 17-239, 241-736 and 738-1107) and 12
(SEQ ID NOS:1109-1589, 1591-1964 and 1966-2470), or an hSARS epitope or
antigen-binding fragments thereof can be used for detecting, screening, and isolating the
polypeptide of the invention or fragments thereof, or similar sequences that might encode
similar enzymes from the other organisms. For example, in one specific embodiment, an
antibody which immunospecifically binds an hSARS epitope, or a fragment thereof, can be
used for various in vitro detection assays, including enzyme-linked immunosorbent assays
(ELISA), radioimmunoassays, Western blot, etc., for the detection of a polypeptide of the
invention or, preferably, hSARS, in samples, for example, a biological material, including
cells, cell culture media (e.g., bacterial cell culture media, mammalian cell culture media,
insect cell culture media, yeast cell culture media, etc.), blood, plasma, serum, tissues,
sputum, nasopharyngeal aspirates, etc.

Antibodies specific for a polypeptide of the invention or any epitope of hSARS may
be generated by any suitable method known in the art. Polyclonal antibodies to an
antigen-of-interest, for example, the hSARS virus from deposit no. CCTCC-V200303, or a
virus comprising a nucleotide sequence of SEQ ID NO: 15, can be produced by various
procedures well known in the art. For example, an antigen can be administered to various
host animals including, but not limited to, rabbits, mice, rats, etc., to induce the
production of antisera containing polyclonal antibodies specific for the antigen. Various
adjuvants may be used to increase the immunological response, depending on the host
species, and include but are not limited to, Freund's (complete and incomplete) adjuvant,
mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin,
pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins,
dinitrophenol, and potentially useful adjuvants for humans such as BCG (Bacille
Calmette-Guerin) and Corynebacterium parvum.
Such adjuvants are also well known in the art.

Monoclonal antibodies can be prepared using a wide variety of techniques known in
the art including the use of hybridoma, recombinant, and phage display technologies, or a
combination thereof. For example, monoclonal antibodies can be produced using
hybridoma techniques including those known in the art and taught, for example, in Harlow
et al., Antibodies: A Laboratory Manual, (Cold Spring Harbor Laboratory Press, 2nd ed.
1988); Hammerling et al., in: Monoclonal Antibodies and T-Cell Hybridomas, pp. 563-681
(Elsevier, N.Y., 1981) (both of which are incorporated by reference in their entireties). The
term "monoclonal antibody" as used herein is not limited to antibodies produced through
hybridoma technology. The term "monoclonal antibody" refers to an antibody that is
derived from a single clone, including any eukaryotic, prokaryotic, or phage clone, and not
the method by which it is produced.

Methods for producing and screening for specific antibodies using hybridoma
technology are routine and well known in the art. In a non-limiting example, mice can be
immunized with an antigen of interest or a cell expressing such an antigen. Once an
immune response is detected, e.g., antibodies specific for the antigen are detected in the
mouse serum, the mouse spleen is harvested and splenocytes isolated. The splenocytes are
then fused by well known techniques to any suitable myeloma cells. Hybridomas are
selected and cloned by limiting dilution. The hybridoma clones are then assayed by
methods known in the art for cells that secrete antibodies capable of binding the antigen.
Ascites fluid, which generally contains high levels of antibodies, can be generated by
inoculating mice intraperitoneally with positive hybridoma clones.

Antibody fragments which recognize specific epitopes may be generated by known
techniques. For example, Fab and F(ab')2 fragments may be produced by proteolytic
cleavage of immunoglobulin molecules, using enzymes such as papain (to produce Fab
fragments) or pepsin (to produce F(ab')2 fragments). F(ab')2 fragments contain the
complete light chain, and the variable region, the CH1 region and the hinge region of the
heavy chain.

The antibodies of the invention or fragments thereof can also be produced by any
method known in the art for the synthesis of antibodies, in particular, by chemical synthesis
or, preferably, by recombinant expression techniques.

The nucleotide sequence encoding an antibody may be obtained from any
information available to those skilled in the art (i.e., from GenBank, the literature, or by
routine cloning and sequence analysis). If a clone containing a nucleic acid encoding a
particular antibody or an epitope-binding fragment thereof is not available, but the sequence
of the antibody molecule or epitope-binding fragment thereof is known, a nucleic acid
encoding the immunoglobulin may be chemically synthesized or obtained from a suitable
source (e.g., an antibody cDNA library, or a cDNA library generated from, or nucleic acid,
preferably poly A+ RNA, isolated from any tissue or cells expressing the antibody, such as
hybridoma cells selected to express an antibody) by PCR amplification using synthetic
primers hybridizable to the 3' and 5' ends of the sequence or by cloning using an
oligonucleotide probe specific for the particular gene sequence to identify, e.g., a cDNA
clone from a cDNA library that encodes the antibody. Amplified nucleic acids generated by
PCR may then be cloned into replicable cloning vectors using any method well known in
the art.

Once the nucleotide sequence of the antibody is determined, the nucleotide sequence
of the antibody may be manipulated using methods well known in the art for the
manipulation of nucleotide sequences, e.g., recombinant DNA techniques, site directed
mutagenesis, PCR, etc. (see, for example, the techniques described in Sambrook et al., supra;
and Ausubel et al., eds., 1998, Current Protocols in Molecular Biology, John Wiley & Sons,
NY, which are both incorporated by reference herein in their entireties), to generate
antibodies having a different amino acid sequence by, for example, introducing amino acid
substitutions, deletions, and/or insertions into the epitope-binding domain regions of the
antibodies or any portion of antibodies which may enhance or reduce biological activities of
the antibodies.

Recombinant expression of an antibody requires construction of an expression
vector containing a nucleotide sequence that encodes the antibody. Once a nucleotide
sequence encoding an antibody molecule or a heavy or light chain of an antibody, or portion
thereof has been obtained, the vector for the production of the antibody molecule may be
produced by recombinant DNA technology using techniques well known in the art as
discussed in the previous sections. Methods which are well known to those skilled in the art
can be used to construct expression vectors containing antibody coding sequences and
appropriate transcriptional and translational control signals. These methods include, for
example, in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic
recombination. The nucleotide sequence encoding the heavy-chain variable region,
light-chain variable region, both the heavy-chain and light-chain variable regions, an
epitope-binding fragment of the heavy- and/or light-chain variable region, or one or more
complementarity determining regions (CDRs) of an antibody may be cloned into such a
vector for expression. The expression vector thus prepared can then be introduced into
appropriate host cells for the expression of the antibody. Accordingly, the invention
includes host cells containing a polynucleotide encoding an antibody specific for the
polypeptides of the invention or fragments thereof.

The host cell may be co-transfected with two expression vectors of the invention, the
first vector encoding a heavy chain derived polypeptide and the second vector encoding a
light chain derived polypeptide. The two vectors may contain identical selectable markers
which enable equal expression of heavy and light chain polypeptides or different selectable
markers to ensure maintenance of both plasmids. Alternatively, a single vector may be used
which encodes, and is capable of expressing, both heavy and light chain polypeptides. In
such situations, the light chain should be placed before the heavy chain to avoid an excess
of toxic free heavy chain (Proudfoot, Nature, 322:52, 1986; and Kohler, Proc. Natl. Acad.
Sci. USA, 77:2197, 1980). The coding sequences for the heavy and light chains may
comprise cDNA or genomic DNA.

In another embodiment, antibodies can also be generated using various phage
display methods known in the art. In phage display methods, functional antibody domains
are displayed on the surface of phage particles which carry the polynucleotide sequences
encoding them. In a particular embodiment, such phage can be utilized to display antigen
binding domains, such as Fab and Fv or disulfide-bond stabilized Fv, expressed from a
repertoire or combinatorial antibody library (e.g., human or murine). Phage expressing an
antigen binding domain that binds the antigen of interest can be selected or identified with
antigen, e.g., using labeled antigen or antigen bound or captured to a solid surface or bead.
Phage used in these methods are typically filamentous phage, including fd and M13. The
antigen binding domains are expressed as a recombinantly fused protein to either the phage
gene III or gene VIII protein. Examples of phage display methods that can be used to make
the immunoglobulins, or fragments thereof, of the present invention include those disclosed
in Brinkman et al., J. Immunol. Methods, 182:41-50, 1995; Ames et al., J. Immunol.
Methods, 184:177-186, 1995; Kettleborough et al., Eur. J. Immunol., 24:952-958, 1994;
Persic et al., Gene, 187:9-18, 1997; Burton et al., Advances in Immunology, 57:191-280,
1994; PCT application No. PCT/GB91/01134; PCT publications WO 90/02809; WO
91/10737; WO 92/01047; WO 92/18619; WO 93/11236; WO 95/15982; WO 95/20401; and
U.S. Patent Nos. 5,698,426; 5,223,409; 5,403,484; 5,580,717; 5,427,908; 5,750,753;
5,821,047; 5,571,698; 5,427,908; 5,516,637; 5,780,225; 5,658,727; 5,733,743 and
5,969,108; each of which is incorporated herein by reference in its entirety.

As described in the above references, after phage selection, the antibody coding
regions from the phage can be isolated and used to generate whole antibodies, including
human antibodies, or any other desired fragments, and expressed in any desired host,
including mammalian cells, insect cells, plant cells, yeast, and bacteria, e.g., as described in
detail below. For example, techniques to recombinantly produce Fab, Fab' and F(ab')2
fragments can also be employed using methods known in the art such as those disclosed in
PCT publication WO 92/22324; Mullinax et al., BioTechniques, 12(6):864-869, 1992;
Sawai et al., AJRI, 34:26-34, 1995; and Better et al., Science, 240:1041-1043, 1988 (each of
which is incorporated by reference in its entirety). Examples of techniques which can be
used to produce single-chain Fvs and antibodies include those described in U.S. Patent Nos.
4,946,778 and 5,258,498; Huston et al., Methods in Enzymology, 203:46-88, 1991; Shu et
al., PNAS, 90:7995-7999, 1993; and Skerra et al., Science, 240:1038-1040, 1988.

Once an antibody molecule of the invention has been produced by any methods
described above, it may then be purified by any method known in the art for purification of
an immunoglobulin molecule, for example, by chromatography (e.g., ion exchange, affinity,
particularly by affinity for the specific antigen after Protein A or Protein G purification, and
sizing column chromatography), centrifugation, differential solubility, or by any other
standard techniques for the purification of proteins. Further, the antibodies of the present
invention or fragments thereof may be fused to heterologous polypeptide sequences
described herein or otherwise known in the art to facilitate purification.

For some uses, including in vivo use of antibodies in humans and in vitro detection
assays, it may be preferable to use chimeric, humanized, or human antibodies. A chimeric
antibody is a molecule in which different portions of the antibody are derived from different
animal species, such as antibodies having a variable region derived from a murine
monoclonal antibody and a constant region derived from a human immunoglobulin.
Methods for producing chimeric antibodies are known in the art. See e.g., Morrison,
Science, 229:1202, 1985; Oi et al., BioTechniques, 4:214, 1986; Gillies et al., J. Immunol.
Methods, 125:191-202, 1989; U.S. Patent Nos. 5,807,715; 4,816,567; and 4,816,397, which
are incorporated herein by reference in their entireties. Humanized antibodies are antibody
molecules from non-human species that bind the desired antigen having one or more
complementarity determining regions (CDRs) from the non-human species and framework
regions from a human immunoglobulin molecule. Often, framework residues in the human
framework regions will be substituted with the corresponding residue from the CDR donor
antibody to alter, preferably improve, antigen binding. These framework substitutions are
identified by methods well known in the art, e.g., by modeling of the interactions of the
CDR and framework residues to identify framework residues important for antigen binding
and sequence comparison to identify unusual framework residues at particular positions.
See, e.g., Queen et al., U.S. Patent No. 5,585,089; Riechmann et al., Nature, 332:323, 1988,
which are incorporated herein by reference in their entireties. Antibodies can be humanized
using a variety of techniques known in the art including, for example, CDR-grafting (EP
239,400; PCT publication WO 91/09967; U.S. Patent Nos. 5,225,539; 5,530,101 and
5,585,089), veneering or resurfacing (EP 592,106; EP 519,596; Padlan, Molecular
Immunology, 28(4/5):489-498, 1991; Studnicka et al., Protein Engineering, 7(6):805-814,
1994; Roguska et al., Proc. Natl. Acad. Sci. USA, 91:969-973, 1994), and chain shuffling
(U.S. Patent No. 5,565,332), all of which are hereby incorporated by reference in their
entireties.

Completely human antibodies are particularly desirable for therapeutic treatment of
human patients. Human antibodies can be made by a variety of methods known in the art
including phage display methods described above using antibody libraries derived from
human immunoglobulin sequences. See U.S. Patent Nos. 4,444,887 and 4,716,111; and
PCT publications WO 98/46645; WO 98/50433; WO 98/24893; WO 98/16654; WO
96/34096; WO 96/33735; and WO 91/10741, each of which is incorporated herein by
reference in its entirety.

Human antibodies can also be produced using transgenic mice which are incapable
of expressing functional endogenous immunoglobulins, but which can express human
immunoglobulin genes. For an overview of this technology for producing human
antibodies, see Lonberg and Huszar, Int. Rev. Immunol., 13:65-93, 1995. For a detailed
discussion of this technology for producing human antibodies and human monoclonal
antibodies and protocols for producing such antibodies, see, e.g., PCT publications WO
98/24893; WO 92/01047; WO 96/34096; WO 96/33735; European Patent No. 0 598 877;
U.S. Patent Nos. 5,413,923; 5,625,126; 5,633,425; 5,569,825; 5,661,016; 5,545,806;
5,814,318; 5,885,793; 5,916,771; and 5,939,598, which are incorporated by reference herein
in their entireties. In addition, companies such as Abgenix, Inc. (Fremont, CA), Medarex
(NJ) and Genpharm (San Jose, CA) can be engaged to provide human antibodies directed
against a selected antigen using technology similar to that described above.

Completely human antibodies which recognize a selected epitope can be generated
using a technique referred to as "guided selection." In this approach a selected non-human
monoclonal antibody, e.g., a mouse antibody, is used to guide the selection of a completely
human antibody recognizing the same epitope (Jespers et al., Bio/technology, 12:899-903,
1988).

Antibodies fused or conjugated to heterologous polypeptides may be used in in vitro
immunoassays and in purification methods (e.g., affinity chromatography) well known in
the art. See e.g., PCT publication Number WO 93/21232; EP 439,095; Naramura et al.,
Immunol. Lett., 39:91-99, 1994; U.S. Patent 5,474,981; Gillies et al., PNAS, 89:1428-1432,
1992; and Fell et al., J. Immunol., 146:2446-2452, 1991, which are incorporated herein by
reference in their entireties.

Antibodies may also be attached to solid supports, which are particularly useful for
immunoassays or purification of the polypeptides of the invention or fragments, derivatives,
analogs, or variants thereof, or similar molecules having similar enzymatic activities to
the polypeptide of the invention. Such solid supports include, but are not limited to, glass,
cellulose, polyacrylamide, nylon, polystyrene, polyvinyl chloride or polypropylene.

5.6 Pharmaceutical Compositions and Kits 

The present invention encompasses pharmaceutical compositions comprising
anti-viral agents of the present invention. In a specific embodiment, the anti-viral agent is an
antibody which immunospecifically binds and neutralizes the hSARS virus or variants
thereof, or any proteins derived therefrom (see Section 5.5). In another specific
embodiment, the anti-viral agent is a polypeptide or nucleic acid molecule of the invention
(see, for example, Sections 5.1 and 5.2). The pharmaceutical compositions have utility as
an anti-viral prophylactic agent and may be administered to a subject where the subject has
been exposed or is expected to be exposed to a virus.

Various delivery systems are known and can be used to administer the
pharmaceutical composition of the invention, e.g., encapsulation in liposomes,
microparticles, microcapsules, recombinant cells capable of expressing the mutant viruses,
receptor mediated endocytosis (see, e.g., Wu and Wu, 1987, J. Biol. Chem. 262:4429-4432).
Methods of introduction include but are not limited to intradermal, intramuscular,
intraperitoneal, intravenous, subcutaneous, intranasal, epidural, and oral routes. The
compounds may be administered by any convenient route, for example by infusion or bolus
injection, by absorption through epithelial or mucocutaneous linings (e.g., oral mucosa,
rectal and intestinal mucosa, etc.) and may be administered together with other biologically
active agents. Administration can be systemic or local. In a preferred embodiment, it may
be desirable to introduce the pharmaceutical compositions of the invention into the lungs by
any suitable route. Pulmonary administration can also be employed, e.g., by use of an
inhaler or nebulizer, and formulation with an aerosolizing agent.

In a specific embodiment, it may be desirable to administer the pharmaceutical
compositions of the invention locally to the area in need of treatment; this may be achieved
by, for example, and not by way of limitation, local infusion during surgery, topical application,
e.g., in conjunction with a wound dressing after surgery, by injection, by means of a catheter, by
means of a suppository, by means of nasal spray, or by means of an implant, said implant being of a
porous, non-porous, or gelatinous material, including membranes, such as silastic membranes, or
fibers. In one embodiment, administration can be by direct injection at the site (or former site)
of infected tissues.

In another embodiment, the pharmaceutical composition can be delivered in a
vesicle, in particular a liposome (see Langer, 1990, Science 249:1527-1533; Treat et al., in
Liposomes in the Therapy of Infectious Disease and Cancer, Lopez-Berestein and Fidler (eds.), Liss,
New York, pp. 353-365 (1989); Lopez-Berestein, ibid., pp. 317-327; see generally ibid.).

In yet another embodiment, the pharmaceutical composition can be delivered in a
controlled release system. In one embodiment, a pump may be used (see Langer, supra;
Sefton, 1987, CRC Crit. Ref. Biomed. Eng. 14:201; Buchwald et al., 1980, Surgery 88:507; and
Saudek et al., 1989, N. Engl. J. Med. 321:574). In another embodiment, polymeric materials can be
used (see Medical Applications of Controlled Release, Langer and Wise (eds.), CRC Press, Boca
Raton, Florida (1974); Controlled Drug Bioavailability, Drug Product Design and Performance,
Smolen and Ball (eds.), Wiley, New York (1984); Langer and Peppas, J. Macromol. Sci. Rev.
Macromol. Chem. 23:61 (1983); see also Levy et al., 1985, Science 228:190; During et al., 1989,
Ann. Neurol. 25:351; Howard et al., 1989, J. Neurosurg. 71:105). In yet another embodiment, a
controlled release system can be placed in proximity of the composition's target, i.e., the lung, thus
requiring only a fraction of the systemic dose (see, e.g., Goodson, in Medical Applications of
Controlled Release, supra, vol. 2, pp. 115-138 (1984)).

Other controlled release systems are discussed in the review by Langer (Science 
249:1527-1533 (1990)). 

The pharmaceutical compositions of the present invention comprise a
therapeutically effective amount of a live attenuated, inactivated or killed hSARS virus, or
recombinant or chimeric hSARS virus, and a pharmaceutically acceptable carrier. In a
specific embodiment, the term "pharmaceutically acceptable" means approved by a
regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia or other
generally recognized pharmacopeia for use in animals, and more particularly in humans. The term
"carrier" refers to a diluent, adjuvant, excipient, or vehicle with which the pharmaceutical
composition is administered. Such pharmaceutical carriers can be sterile liquids, such as water and
oils, including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean
oil, mineral oil, sesame oil and the like. Water is a preferred carrier when the pharmaceutical
composition is administered intravenously. Saline solutions and aqueous dextrose and glycerol
solutions can also be employed as liquid carriers, particularly for injectable solutions. Suitable
pharmaceutical excipients include starch, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk,
silica gel, sodium stearate, glycerol monostearate, talc, sodium chloride, dried skim milk, glycerol,
propylene glycol, water, ethanol and the like. The composition, if desired, can also contain minor
amounts of wetting or emulsifying agents, or pH buffering agents. These compositions can take the
form of solutions, suspensions, emulsions, tablets, pills, capsules, powders, sustained release
formulations and the like. The composition can be formulated as a suppository, with traditional
binders and carriers such as triglycerides. Oral formulation can include standard carriers such as
pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine,
cellulose, magnesium carbonate, etc. Examples of suitable pharmaceutical carriers are described in
"Remington's Pharmaceutical Sciences" by E.W. Martin. The formulation should suit the mode of
administration.

In a preferred embodiment, the composition is formulated in accordance with
routine procedures as a pharmaceutical composition adapted for intravenous administration
to human beings. Typically, compositions for intravenous administration are solutions in
sterile isotonic aqueous buffer. Where necessary, the composition may also include a solubilizing
agent and a local anesthetic such as lignocaine to ease pain at the site of the injection. Generally, the
ingredients are supplied either separately or mixed together in unit dosage form, for example, as a
dry lyophilized powder or water free concentrate in a hermetically sealed container such as an
ampoule or sachette indicating the quantity of active agent. Where the composition is to be
administered by infusion, it can be dispensed with an infusion bottle containing sterile
pharmaceutical grade water or saline. Where the composition is administered by injection, an
ampoule of sterile water for injection or saline can be provided so that the ingredients may be mixed
prior to administration.

The pharmaceutical compositions of the invention can be formulated as neutral or
salt forms. Pharmaceutically acceptable salts include those formed with free amino groups
such as those derived from hydrochloric, phosphoric, acetic, oxalic, tartaric acids, etc., and
those formed with free carboxyl groups such as those derived from sodium, potassium, ammonium,
calcium, ferric hydroxides, isopropylamine, triethylamine, 2-ethylamino ethanol, histidine, procaine,
etc.

The amount of the pharmaceutical composition of the invention which will be 
effective in the treatment of a particular disorder or condition will depend on the nature of 
the disorder or condition, and can be determined by standard clinical techniques. In 
addition, in vitro assays may optionally be employed to help identify optimal dosage ranges. 
The precise dose to be employed in the formulation will also depend on the route of 
administration, and the seriousness of the disease or disorder, and should be decided 
according to the judgment of the practitioner and each patient's circumstances. However, 
suitable dosage ranges for intravenous administration are generally about 20-500 
micrograms of active compound per kilogram body weight. Suitable dosage ranges for 
intranasal administration are generally about 0.01 pg/kg body weight to 1 mg/kg body 
weight. Effective doses may be extrapolated from dose response curves derived from in 
vitro or animal model test systems. 

Suppositories generally contain active ingredient in the range of 0.5% to 10% by 
weight; oral formulations preferably contain 10% to 95% active ingredient. 

The invention also provides a pharmaceutical pack or kit comprising one or more 
containers filled with one or more of the ingredients of the pharmaceutical compositions of 
the invention. Optionally associated with such container(s) can be a notice in the form 
prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or 
biological products, which notice reflects approval by the agency of manufacture, use or sale for 
human administration. In a preferred embodiment, the kit contains an anti-viral agent of the 
invention, e.g., an antibody specific for the polypeptides encoded by a nucleotide sequence 
of SEQ ID NO: 1, 11, 13, or 15, or as shown in Figures 11 (SEQ ID NOS: 17-239, 241-736 
and 738-1107) and 12 (SEQ ID NOS: 1109-1589, 1591-1964 and 1966-2470), or any 
hSARS epitope, or a polypeptide or protein of the present invention, or a nucleic acid 
molecule of the invention, alone or in combination with adjuvants, antivirals, antibiotics, 
analgesics, bronchodilators, or other pharmaceutically acceptable excipients. 

The present invention further encompasses kits comprising a container containing a 
pharmaceutical composition of the present invention and instructions for use. 



5.7 Detection Assays 

The present invention provides a method for detecting an antibody, which 
immunospecifically binds to the hSARS virus, in a biological sample, for example blood, 
serum, plasma, saliva, urine, etc., from a patient suffering from SARS. In a specific 
embodiment, the method comprises contacting the sample with the hSARS virus, for 
example, of deposit no. CCTCC-V200303, or having a genomic nucleic acid sequence of 
SEQ ID NO: 15, directly immobilized on a substrate and detecting the virus-bound antibody 
directly or indirectly by a labeled heterologous anti-isotype antibody. In another specific 
embodiment, the sample is contacted with a host cell which is infected by the hSARS virus, 
for example, of deposit no. CCTCC-V200303, or having a genomic nucleic acid sequence 
of SEQ ID NO: 15, and the bound antibody can be detected by immunofluorescent assay as 
described in Section 6.5, infra. 

An exemplary method for detecting the presence or absence of a polypeptide or 
nucleic acid of the invention in a biological sample involves obtaining a biological sample 
from various sources and contacting the sample with a compound or an agent capable of 
detecting an epitope or nucleic acid (e.g., mRNA, genomic DNA) of the hSARS virus such 
that the presence of the hSARS virus is detected in the sample. A preferred agent for 
detecting hSARS mRNA or genomic RNA of the invention is a labeled nucleic acid probe 
capable of hybridizing to mRNA or genomic RNA encoding a polypeptide of the invention. 
The nucleic acid probe can be, for example, a nucleic acid molecule comprising or 
consisting of the nucleotide sequence of SEQ ID NO:1, 11, 13, or 15, or a portion thereof, 
such as an oligonucleotide of at least 15, 20, 25, 30, 50, 100, 250, 500, 750, 1,000 or more 
contiguous nucleotides in length and sufficient to specifically hybridize under stringent 
conditions to a hSARS mRNA or genomic RNA. 

In another preferred specific embodiment, the presence of hSARS virus is detected 
in the sample by a reverse transcription polymerase chain reaction (RT-PCR) using 
primers that are constructed based on a partial nucleotide sequence of the genome of 
hSARS virus, for example, that of deposit accession no. CCTCC-V200303, or having a 
genomic nucleic acid sequence of SEQ ID NO: 15, or based on a nucleotide sequence of 
SEQ ID NO:1, 11, 13, or 15. In a non-limiting specific embodiment, preferred primers to 
be used in an RT-PCR method are: 5'-TACACACCTCAGCGTTG-3' (SEQ ID NO:3) and 
5'-CACGAACGTGACGAAT-3' (SEQ ID NO:4), in the presence of 2.5 mM MgCl2, and 
the thermal cycles are, for example, but not limited to, 94 °C for 8 min followed by 40 
cycles of 94 °C for 1 min, 50 °C for 1 min, 72 °C for 1 min (also see Section 6.7, infra). 
In a more preferred specific embodiment, the present invention provides a real-time 
quantitative PCR assay to detect the presence of hSARS virus in a biological sample by 
subjecting the cDNA obtained by reverse transcription of the extracted total RNA from the 
sample to PCR reactions using the specific primers, such as those having nucleotide 
sequences of SEQ ID NOS:3 and 4, and a fluorescence dye, such as SYBR® Green I, which 
fluoresces when bound non-specifically to double-stranded DNA. The fluorescence signals 
from these reactions are captured at the end of extension steps as PCR product is generated 
over a range of the thermal cycles, thereby allowing the quantitative determination of the 
viral load in the sample based on an amplification plot (see Section 6.7, infra). 
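
As an illustration of how an amplification plot can be read quantitatively, the following minimal sketch estimates a threshold cycle, i.e., the fractional cycle at which the per-cycle fluorescence first reaches a chosen threshold. The function name, the threshold value and the linear interpolation between cycles are assumptions made for illustration only; they are not taken from the assay described herein.

```python
from typing import Optional, Sequence


def threshold_cycle(fluorescence: Sequence[float], threshold: float) -> Optional[float]:
    """Estimate the fractional cycle at which fluorescence first reaches `threshold`.

    fluorescence[i] is the reading captured at the end of the extension step
    of cycle i + 1 (cycles are numbered from 1). Linear interpolation is used
    between the two cycles that bracket the threshold (an assumption of this
    sketch). Returns None if the threshold is never reached.
    """
    for i, value in enumerate(fluorescence):
        if value >= threshold:
            if i == 0:
                return 1.0  # already at or above threshold in the first cycle
            prev = fluorescence[i - 1]  # reading of cycle i, below threshold
            return i + (threshold - prev) / (value - prev)
    return None


# Hypothetical readings, not data from this specification.
readings = [0.1, 0.1, 0.2, 0.4, 0.8, 1.6]
ct = threshold_cycle(readings, threshold=0.6)
```

A lower threshold cycle corresponds to more starting template, so samples can be ranked by viral load, or quantified against a dilution series of the positive-control plasmid.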

A preferred agent for detecting hSARS is an antibody that specifically binds a 
polypeptide of the invention or any hSARS epitope, preferably an antibody with a 
detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact 
antibody, or a fragment thereof (e.g., Fab or F(ab')2) can be used. 

The term "labeled", with regard to the probe or antibody, is intended to encompass 
direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable 
substance to the probe or antibody, as well as indirect labeling of the probe or antibody by 
reactivity with another reagent that is directly labeled. Examples of indirect labeling 
include detection of a primary antibody using a fluorescently labeled secondary antibody 
and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently 
labeled streptavidin. The detection method of the invention can be used to detect mRNA, 
protein (or any epitope), or genomic RNA in a sample in vitro as well as in vivo. For 
example, in vitro techniques for detection of mRNA include northern hybridizations, in situ 
hybridizations, RT-PCR, and RNase protection. In vitro techniques for detection of an 
epitope of hSARS include enzyme linked immunosorbent assays (ELISAs), Western blots, 
immunoprecipitations and immunofluorescence. In vitro techniques for detection of 
genomic RNA include northern hybridizations, RT-PCR, and RNase protection. 
Furthermore, in vivo techniques for detection of hSARS include introducing into a subject 
organism a labeled antibody directed against the polypeptide. For example, the antibody 
can be labeled with a radioactive marker whose presence and location in the subject 
organism can be detected by standard imaging techniques, including autoradiography. 

In a specific embodiment, the methods further involve obtaining a control sample 
from a control subject, contacting the control sample with a compound or agent capable of 
detecting hSARS, e.g., a polypeptide of the invention or mRNA or genomic RNA encoding 
a polypeptide of the invention, such that the presence of hSARS or the polypeptide or 
mRNA or genomic RNA encoding the polypeptide is detected in the sample, and comparing 
the absence of hSARS or the polypeptide or mRNA or genomic RNA encoding the 
polypeptide in the control sample with the presence of hSARS, or the polypeptide or mRNA 
or genomic RNA encoding the polypeptide in the test sample. 

The invention also encompasses kits for detecting the presence of hSARS or a 
polypeptide or nucleic acid of the invention in a test sample. The kit, for example, can 
comprise a labeled compound or agent capable of detecting hSARS or the polypeptide or a 
nucleic acid molecule encoding the polypeptide in a test sample and, in certain 
embodiments, a means for determining the amount of the polypeptide or mRNA in the 
sample (e.g., an antibody which binds the polypeptide or an oligonucleotide probe which 
binds to DNA or mRNA encoding the polypeptide). Kits can also include instructions for 
use. 

For antibody-based kits, the kit can comprise, for example: (1) a first antibody (e.g., 
attached to a solid support) which binds to a polypeptide of the invention or hSARS epitope; 
and, optionally, (2) a second, different antibody which binds to either the polypeptide or the 
first antibody and is conjugated to a detectable agent. 

For oligonucleotide-based kits, the kit can comprise, for example: (1) an 
oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic 
acid sequence encoding a polypeptide of the invention or to a sequence within the hSARS 
genome or (2) a pair of primers useful for amplifying a nucleic acid molecule containing an 
hSARS sequence. The kit can also comprise, e.g., a buffering agent, a preservative, or a 
protein stabilizing agent. The kit can also comprise components necessary for detecting the 
detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample 
or a series of control samples which can be assayed and compared to the test sample 
contained. Each component of the kit is usually enclosed within an individual container and 
all of the various containers are within a single package along with instructions for use. 

5.8 Screening Assays to Identify Anti-Viral Agents 

The invention provides methods for the identification of a compound that inhibits 
the ability of hSARS virus to infect a host or a host cell. In certain embodiments, the 
invention provides methods for the identification of a compound that reduces the ability of 
hSARS virus to replicate in a host or a host cell. Any technique well-known to the skilled 
artisan can be used to screen for a compound that would abolish or reduce the ability of 
hSARS virus to infect a host and/or to replicate in a host or a host cell. 

In certain embodiments, the invention provides methods for the identification of a 
compound that inhibits the ability of hSARS virus to replicate in a mammal or a 
mammalian cell. More specifically, the invention provides methods for the identification of 
a compound that inhibits the ability of hSARS virus to infect a mammal or a mammalian 
cell. In certain embodiments, the invention provides methods for the identification of a 
compound that inhibits the ability of hSARS virus to replicate in a mammalian cell. In a 
specific embodiment, the mammalian cell is a human cell. 

In another embodiment, a cell is contacted with a test compound and infected with 
the hSARS virus. In certain embodiments, a control culture is infected with the hSARS 
virus in the absence of a test compound. The cell can be contacted with a test compound 
before, concurrently with, or subsequent to the infection with the hSARS virus. In a 
specific embodiment, the cell is a mammalian cell. In an even more specific embodiment, 




WO 2004/085633 



PCT/CN2004/000248 



the cell is a human cell. In certain embodiments, the cell is incubated with the test 
compound for at least 1 minute, at least 5 minutes, at least 15 minutes, at least 30 minutes, at 
least 1 hour, at least 2 hours, at least 5 hours, at least 12 hours, or at least 1 day. The titer of 
the virus can be measured at any time during the assay. In certain embodiments, a time 
course of viral growth in the culture is determined. If the viral growth is inhibited or 
reduced in the presence of the test compound, the test compound is identified as being 
effective in inhibiting or reducing the growth or infection of the hSARS virus. In a specific 
embodiment, the compound that inhibits or reduces the growth of the hSARS virus is tested 
for its ability to inhibit or reduce the growth rate of other viruses to test its specificity for the 
hSARS virus. 
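
By way of a hedged illustration, the comparison between a treated culture and the untreated control culture can be expressed numerically once titers have been measured. The sketch below computes percent inhibition and log10 reduction; the function names and the example titers are hypothetical and are not part of the assays described herein.

```python
import math


def percent_inhibition(control_titer: float, treated_titer: float) -> float:
    """Percent reduction of the viral titer relative to the untreated control."""
    if control_titer <= 0:
        raise ValueError("control titer must be positive")
    return 100.0 * (1.0 - treated_titer / control_titer)


def log10_reduction(control_titer: float, treated_titer: float) -> float:
    """Titer reduction on a log10 scale (both titers must be positive)."""
    if control_titer <= 0 or treated_titer <= 0:
        raise ValueError("titers must be positive")
    return math.log10(control_titer / treated_titer)


# Hypothetical titers in arbitrary infectious units per ml.
inhibition = percent_inhibition(control_titer=1e6, treated_titer=1e4)
reduction = log10_reduction(control_titer=1e6, treated_titer=1e4)
```

Repeating the calculation at each time point of a time course gives an inhibition curve for the test compound, and repeating it with other viruses gives a simple measure of specificity.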

In one embodiment, a test compound is administered to a model animal and the 
model animal is infected with the hSARS virus. In certain embodiments, a control model 
animal is infected with the hSARS virus without the administration of a test compound. 
The test compound can be administered before, concurrently with, or subsequent to the 
infection with the hSARS virus. In a specific embodiment, the model animal is a mammal. 
In an even more specific embodiment, the model animal can be, but is not limited to, a 
cotton rat, a mouse, or a monkey. The titer of the virus in the model animal can be 
measured at any time during the assay. In certain embodiments, a time course of viral 
growth in the culture is determined. If the viral growth is inhibited or reduced in the 
presence of the test compound, the test compound is identified as being effective in 
inhibiting or reducing the growth or infection of the hSARS virus. In a specific 
embodiment, the compound that inhibits or reduces the growth of the hSARS in the model 
animal is tested for its ability to inhibit or reduce the growth rate of other viruses to test its 
specificity for the hSARS virus. 


6. EXAMPLES 

The following examples illustrate the isolation and identification of the novel 
hSARS virus. These examples should not be construed as limiting. 

METHODS AND RESULTS 




As a general reference, Wiedbrauk DL & Johnston SLG. (Manual of Clinical 
Virology, Raven Press, New York, 1993) was used. 

6.1 Clinical Subjects 

The study included all 50 patients who fitted a modified World Health Organization 
(WHO) definition of SARS and were admitted to 2 acute regional hospitals in Hong Kong 
Special Administrative Region (HKSAR) between February 26 and March 26, 2003 (WHO. 
Severe acute respiratory syndrome (SARS) Weekly Epidemiol Rec. 2003; 78: 81-83). A 
lung biopsy from an additional patient, who had typical SARS and was admitted to a third 
hospital, was also included in the study. Briefly, the case definition for SARS was: (i) fever 
of 38°C or more; (ii) cough or shortness of breath; (iii) new pulmonary infiltrates on chest 
radiograph; and (iv) either a history of exposure to a patient with SARS or absence of 
response to empirical antimicrobial coverage for typical and atypical pneumonia (beta- 
lactams and macrolides, fluoroquinolones or tetracyclines). 

Nasopharyngeal aspirates and serum samples were collected from all patients. 
Paired acute and convalescent sera and feces were available from some patients. Lung 
biopsy tissue from one patient was processed for a viral culture, RT-PCR, routine 
histopathological examination, and electron microscopy. Nasopharyngeal aspirates, feces 
and sera submitted for microbiological investigation of other diseases were included in the 
study under blinding and served as controls. 

The medical records were reviewed retrospectively by the attending physicians and 
clinical microbiologists. Routine hematological, biochemical and microbiological 
examinations, including bacterial culture of blood and sputum, serological study and 
collection of nasopharyngeal aspirates for virological tests, were carried out. 

6.2 Cell Line 

FRhK-4 (fetal rhesus monkey kidney) cells were maintained in minimal essential 
medium (MEM) with 1% fetal calf serum, 1% streptomycin and penicillin, 0.2% nystatin 
and 0.05% garamycin. 

6.3 Viral Infection 

Two-hundred µl of clinical (nasopharyngeal aspirates) samples, from two patients 
(see the Result section, infra), in virus transport medium were used to infect FRhK-4 cells. 
The inoculated cells were incubated at 37°C for 1 hour. One ml of MEM containing 1 µg 
trypsin was then added to the culture and the infected cells were incubated in a 37°C 
incubator supplied with 5% carbon dioxide. Cytopathic effects were observed in the 
infected cells after 2 to 4 days of incubation. The infected cells were passaged into new 
FRhK-4 cells and cytopathic effects were observed within 1 day after the inoculation. The 
infected cells were tested by an immunofluorescent assay for influenza A, influenza B, 
respiratory syncytial virus, parainfluenza types 1, 2 and 3, adenovirus and human 
metapneumovirus (hMPV) and negative results were obtained for all cases. The infected 
cells were also tested by RT-PCR for influenza A and human metapneumovirus with 
negative results. 



6.4 Virus Morphology 

The infected cells prepared as described above were harvested, pelleted by 
centrifugation and the cell pellets were processed for thin-section transmission electron 
microscopic visualization. Viral particles were identified in the cells infected with both 
clinical specimens, but not in control cells which were not infected with the virus. Virions 
isolated from the infected cells were about 70-100 nanometers (Figure 2). Viral capsids 
were found predominantly within the vesicles of the Golgi and endoplasmic reticulum and 
were not free in the cytoplasm. Virus particles were also found at the cell membrane. 

One virus isolate was ultracentrifuged and the cell pellet was negatively stained 
using phosphotungstic acid. Virus particles characteristic of Coronaviridae were thus 
visualized. Since the human coronaviruses hitherto recognized are not known to cause a 
similar disease, the present inventors postulated that the virus isolates represent a novel 
virus that infects humans. 



6.5 Antibody Response to the Isolated Virus 

To further confirm that this novel virus is responsible for causing SARS in the 
infected patients, blood serum samples from the patients who were suffering from SARS 
were obtained and a neutralization test was performed. Typically diluted serum (x50, x200, 
x800 and x1600) was incubated with acetone-fixed FRhK-4 cells infected with hSARS at 
37°C for 45 minutes. The incubated cells were then washed with phosphate-buffered saline 
and stained with anti-human IgG-FITC conjugated antibody. The cells were then washed 
and examined under a fluorescent microscope. In these experiments, positive signals were 
found in 8 patients who had SARS (Figure 3), indicating that these patients had an IgG 
antibody response to this novel human respiratory virus of Coronaviridae. By contrast, no 
signal was detected in 4 negative-control paired sera. The serum titers of anti-hSARS 
antibodies of the tested patients are shown in Table 1. 







Table 1 

Name              Date          Lab No.       Anti-SARS 
Patient A         25-Feb-03     S2728         <50 
                  6-Mar-03      S2728         1600 
Patient B         26-Feb-03     S2441         50 
                  3-Mar-03      S2441         200 
Patient C         4-Mar-03      S3279         200 
                  14-Mar-03     S3279         1600 
Patient D         6-Mar-03      M41045        <50 
                  11-Mar-03     MB943703      800 
Patient E         4-Mar-03      M38953        <50 
                  18-Mar-03     KWH03/3601    800 
Control F         13-Feb-03     M27124        <50 
                  1-Mar-03      MB942968      <50 
Patient G         3-Mar-03      M38685        <50 
                  7-Mar-03      KWH03/2900    Equivocal 

Blinded samples: 
1a *              Acute                       <50 
1b                Convalescent                1600 
2a *              Acute                       50 
2b                Convalescent                >1600 
3a *              Acute                       50 
3b                Convalescent                >1600 
4a *              Acute                       <50 
4b                Convalescent                <50 
5a *              Acute                       <50 
5b                Convalescent                <50 
6a *              Acute                       <50 
6b                Convalescent                <50 

NB: * patients with SARS 

These results indicated that this novel member of Coronaviridae is a key pathogen 
in SARS. 
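
The paired titers of Table 1 can be compared programmatically. The following is a minimal sketch that computes the fold rise between acute and convalescent sera for the named subjects. The four-fold cut-off is a conventional serological criterion assumed here for illustration and is not stated in this specification; "<50" is treated conservatively as 50, so the fold rise for such pairs is a lower bound. The pair with an "Equivocal" result (Patient G) is omitted.

```python
def titer_value(text: str) -> int:
    """Parse a reciprocal titer such as '<50', '200' or '>1600' to its numeric bound."""
    return int(text.strip().lstrip("<>"))


def fold_rise(acute: str, convalescent: str) -> float:
    """Convalescent titer divided by acute titer, using the numeric bounds."""
    return titer_value(convalescent) / titer_value(acute)


def seroconverted(acute: str, convalescent: str, min_fold: float = 4.0) -> bool:
    """True if the titer rose at least `min_fold` times (assumed criterion)."""
    return fold_rise(acute, convalescent) >= min_fold


# (acute, convalescent) titers transcribed from Table 1.
PAIRED_SERA = {
    "Patient A": ("<50", "1600"),
    "Patient B": ("50", "200"),
    "Patient C": ("200", "1600"),
    "Patient D": ("<50", "800"),
    "Patient E": ("<50", "800"),
    "Control F": ("<50", "<50"),
}

results = {name: seroconverted(*pair) for name, pair in PAIRED_SERA.items()}
```

Under these assumptions, Patients A through E meet the four-fold criterion and Control F does not.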

6.6 Sequences of the hSARS Virus 

Total RNA from infected or uninfected FRhK-4 cells was harvested two days 
post-infection. One-hundred ng of purified RNA was reverse transcribed using Superscript® II 
reverse transcriptase (Invitrogen) in a 20 µl reaction mixture containing 10 pg of a 
degenerate primer (5'-GCCGGAGCTCTGCAGAATTCNNNNNNN-3'; SEQ ID NO:5; 
N=A, T, G or C) as recommended by the manufacturer. Reverse transcribed products were 
then purified by a QIAquick® PCR purification kit as instructed by the manufacturer and 
eluted in 30 µl of 10 mM Tris-HCl, pH 8.0. Three µl of purified cDNA products were added 
to a 25 µl reaction mixture containing 2.5 µl of 10x PCR buffer, 4 µl of 25 mM MgCl2, 0.5 
µl of 10 mM dNTP, 0.25 µl of AmpliTaq Gold® DNA polymerase (Applied Biosystems), 
2.5 nCi of [α-32P]CTP (Amersham), and 2 µl of 10 µM primer 
(5'-GCCGGAGCTCTGCAGAATTC-3'; SEQ ID NO:6). Reactions were thermal cycled 
through the following profile: 94°C for 8 min followed by 2 cycles of 94°C for 1 min, 
40°C for 1 min, 72°C for 2 min. This temperature profile was followed by 35 cycles of 
94°C for 1 min, 60°C for 1 min, 72°C for 1 min. Six µl of the PCR products were analyzed 
by 5% denaturing polyacrylamide gel electrophoresis. The gel was exposed to X-ray film and the 
film was developed after an overnight exposure. Unique PCR products which were only 
identified in infected cell samples were isolated from the gel and eluted in 50 µl of 1x TE 
buffer. Eluted PCR products were then re-amplified in 25 µl of reaction mixture containing 
2.5 µl of 10x PCR buffer, 4 µl of 25 mM MgCl2, 0.5 µl of 10 mM dNTP, 0.25 µl of 
AmpliTaq Gold® DNA polymerase (Applied Biosystems), and 1 µl of 10 µM primer 
(5'-GCCGGAGCTCTGCAGAATTC-3'; SEQ ID NO:6). Reaction mixtures were thermal 
cycled through the following profile: 94°C for 8 min followed by 35 cycles of 94°C for 1 
min, 60°C for 1 min, 72°C for 1 min. PCR products were cloned using a TOPO TA 
Cloning® kit (Invitrogen) and ligated plasmids were transformed into TOP10 E. coli 
competent cells (Invitrogen). PCR inserts were sequenced by a BigDye cycle sequencing 
kit as recommended by the manufacturer (Applied Biosystems) and sequencing products 
were analyzed by an automatic sequencer (Applied Biosystems, model number 3770). The 
obtained sequence (SEQ ID NO:1) is shown in Figure 1. The deduced amino acid 
sequence (SEQ ID NO:2) from the obtained DNA sequence showed 57% homology to the 
polymerase protein of identified coronaviruses. 

Similarly, two other partial sequences (SEQ ID NOS:11 and 13) and deduced amino 
acid sequences (SEQ ID NOS:12 and 14, respectively) were obtained from the hSARS virus 
and are shown in Figures 8 (SEQ ID NOS:11 and 12) and 9 (SEQ ID NOS:13 and 14). 

The entire genomic sequence of hSARS virus is shown in Figure 10 (SEQ ID 
NO: 15). The deduced amino acid sequences of SEQ ID NO: 15 in all three frames are 
shown in Figure 11 (nucleotide sequences shown in SEQ ID NOS: 16, 240 and 737; for 
amino acid sequences, see SEQ ID NOS: 17-239, 241-736 and 738-1107). The deduced 
amino acid sequences of the complement of SEQ ID NO: 15 in all three frames are shown in 
Figure 12 (nucleotide sequences shown in SEQ ID NOS: 1108, 1590 and 1965; for amino acid 
sequences, see SEQ ID NOS: 1109-1589, 1591-1964 and 1966-2470). 

6.7 Detection of hSARS Virus in Nasopharyngeal Aspirates 

First, the nasopharyngeal aspirates (NPA) were examined by rapid 
immunofluorescent antigen detection for influenza A and B, parainfluenza types 1, 2 and 3, 
respiratory syncytial virus and adenovirus (Chan KH, Maldeis N, Pope W, Yup A, Ozinskas 
A, Gill J, Seto WH, Shortridge KF, Peiris JSM. Evaluation of Directigen Flu A+B test for 
rapid diagnosis of influenza A and B virus infections. J Clin Microbiol 2002; 40: 1675-1680) 
and were cultured for conventional respiratory pathogens on Madin-Darby Canine 
Kidney, LLC-Mk2, RDE, Hep-2 and MRC-5 cells (Wiedbrauk DL, Johnston SLG. Manual 
of clinical virology. Raven Press, New York. 1993). Subsequently, fetal rhesus kidney 
(FRhK-4) and A-549 cells were added to the panel of cell lines used. Reverse transcription 
polymerase chain reaction (RT-PCR) was performed directly on the clinical specimen for 
influenza A (Fouchier RA, Bestebroer TM, Herfst S, Van Der Kemp L, Rimmelzwaan GF, 
Osterhaus AD. Detection of influenza A virus from different species by PCR amplification 
of conserved sequences in the matrix gene. J Clin Microbiol 2000; 38: 4096-101) and 
human metapneumovirus (HMPV). The primers used for HMPV were: for first round, 
5'-AARGTSAATGCATCAGC-3' (SEQ ID NO:7) and 5'-CAKATTYTGCTTATGCTTTC-3' 
(SEQ ID NO:8); and nested primers: 5'-ACACCTGTTACAATACCAGC-3' (SEQ ID 
NO:9) and 5'-GACTTGAGTCCCAGCTCCA-3' (SEQ ID NO:10). The size of the nested 
PCR product was 201 bp. An ELISA for mycoplasma was used to screen cell cultures 
(Roche Diagnostics GmbH, Roche, Indianapolis, USA). 



RT-PCR Assay 

Subsequent to culturing and genetic sequencing of the hSARS virus from two 
patients (see Section 6.6, supra), an RT-PCR was developed to detect the hSARS virus 
sequence from NPA samples. Total RNA from clinical samples was reverse transcribed 
using random hexamers and cDNA was amplified using primers 5'-TACACACCTCAGCGTTG-3' 
(SEQ ID NO:3) and 5'-CACGAACGTGACGAAT-3' (SEQ ID NO:4), which are 
constructed based on the RNA-dependent RNA polymerase-encoding sequence (SEQ ID 
NO:1) of the hSARS virus, in the presence of 2.5 mM MgCl2 (94 °C for 8 min followed by 
40 cycles of 94 °C for 1 min, 50 °C for 1 min, 72 °C for 1 min). 
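
As a rough cross-check of the primer pair against the 50 °C annealing step, the sketch below computes the length, GC fraction and an approximate melting temperature of each primer. The Wallace rule (2 °C per A or T, 4 °C per G or C) is a general rule of thumb for short oligonucleotides and is an assumption of this sketch; the specification does not state how the annealing temperature was chosen.

```python
FORWARD_PRIMER = "TACACACCTCAGCGTTG"  # SEQ ID NO:3
REVERSE_PRIMER = "CACGAACGTGACGAAT"   # SEQ ID NO:4


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, primer in (("forward", FORWARD_PRIMER), ("reverse", REVERSE_PRIMER)):
    print(name, len(primer), round(gc_fraction(primer), 2), wallace_tm(primer))
```

By this estimate the forward primer (17 nt) comes out at about 52 °C and the reverse primer (16 nt) at about 48 °C, which brackets the 50 °C annealing temperature used above.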

The summary of a typical RT-PCR protocol is as follows: 

1. RNA extraction 

RNA from 140 µl of NPA samples is extracted by QIAquick viral RNA extraction 
kit and is eluted in 50 µl of elution buffer. 

2. Reverse transcription 

RNA                                        11.5 µl 
0.1 M DTT                                  2 µl 
5x buffer                                  4 µl 
10 mM dNTP                                 1 µl 
Superscript II, 200 U/µl (Invitrogen)      1 µl 
Random hexamers, 0.3 µg/µl                 0.5 µl 

Reaction condition: 42 °C, 50 min 
                    94 °C, 3 min 
                    4 °C 

3. PCR 

cDNA generated by random primers is amplified in a 50 µl reaction as follows: 

cDNA                                                      2 µl 
10 mM dNTP                                                0.5 µl 
10x buffer                                                5 µl 
25 mM MgCl2                                               5 µl 
25 µM Forward primer                                      0.5 µl 
25 µM Reverse primer                                      0.5 µl 
AmpliTaq Gold® polymerase, 5 U/µl (Applied Biosystems)    0.25 µl 
Water                                                     36.25 µl 

Thermal-cycle condition: 95 °C, 10 min, followed by 40 cycles of 95 °C, 1 min; 
50 °C, 1 min; 72 °C, 1 min. 

4. Primer sequences 

Primers were designed based on the RNA-dependent RNA polymerase encoding 
sequence (SEQ ID NO:1) of the hSARS virus. 

Forward primer: 5'-TACACACCTCAGCGTTG-3' (SEQ ID NO:3) 
Reverse primer: 5'-CACGAACGTGACGAAT-3' (SEQ ID NO:4) 

Product size: 182 bps 
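
The component volumes listed in steps 2 and 3 above can be checked arithmetically. The sketch below sums each mix and derives final concentrations by simple dilution (stock concentration times component volume divided by total volume). The MgCl2 volume is taken as 5 µl, an assumption made here because the unit is not legible in the listing; with that value the PCR mix totals 50 µl. The dictionary and function names are illustrative only.

```python
# Volumes in microliters, transcribed from the protocol summary above.
RT_MIX_UL = {
    "RNA": 11.5,
    "0.1 M DTT": 2.0,
    "5x buffer": 4.0,
    "10 mM dNTP": 1.0,
    "Superscript II": 1.0,
    "Random hexamers": 0.5,
}

PCR_MIX_UL = {
    "cDNA": 2.0,
    "10 mM dNTP": 0.5,
    "10x buffer": 5.0,
    "25 mM MgCl2": 5.0,  # assumed to be 5 ul; unit missing in the source listing
    "25 uM forward primer": 0.5,
    "25 uM reverse primer": 0.5,
    "AmpliTaq Gold": 0.25,
    "Water": 36.25,
}


def total_volume(mix: dict) -> float:
    """Sum of all component volumes in microliters."""
    return sum(mix.values())


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Concentration after dilution, in the same unit as `stock`."""
    return stock * volume_ul / total_ul


pcr_total = total_volume(PCR_MIX_UL)
mgcl2_mM = final_concentration(25.0, PCR_MIX_UL["25 mM MgCl2"], pcr_total)
primer_uM = final_concentration(25.0, PCR_MIX_UL["25 uM forward primer"], pcr_total)
```

With these volumes the reverse transcription mix totals 20 µl and the PCR mix 50 µl, and the final MgCl2 concentration works out to 2.5 mM, in agreement with the value stated for the RT-PCR assay.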

Real-Time Quantitative PCR Assay 

Total RNA from 140 µl of nasopharyngeal aspirate (NPA) was extracted by 
QIAamp® virus RNA mini kit (Qiagen) as instructed by the manufacturer. Ten µl of eluted 
RNA samples were reverse transcribed by 200 U of Superscript® II reverse transcriptase 
(Invitrogen) in a 20 µl reaction mixture containing 0.15 µg of random hexamers, 10 mmol/L 
DTT, and 0.5 mmol/L dNTP, as instructed. Complementary DNA was then amplified in a 
SYBR® Green I fluorescence reaction mixture (Roche). Briefly, 20 µl reaction mixtures 
containing 2 µl of cDNA, 3.5 mmol/L MgCl2, 0.25 µmol/L of forward primer 
(5'-TACACACCTCAGCGTTG-3'; SEQ ID NO:3) and 0.25 µmol/L reverse primer 
(5'-CACGAACGTGACGAAT-3'; SEQ ID NO:4) were thermal-cycled by a Light-Cycler 
(Roche) with the PCR program [95°C, 10 min followed by 50 cycles of 95°C, 10 min; 
57°C, 5 sec; 72°C, 9 sec]. Plasmids containing the target sequence were used as positive 
controls. Fluorescence signals from these reactions were captured at the end of extension 
step in each cycle (see Fig. 7A). To determine the specificity of the assay, PCR products 
(184 base pairs) were subjected to a melting curve analysis at the end of the assay (65°C to 
95°C, 0.1 °C per second; see Fig. 7B). 
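
A melting curve of this kind is commonly summarized by the temperature at which fluorescence falls most steeply, i.e., the peak of the negative derivative -dF/dT. The following minimal sketch locates that peak by finite differences between adjacent readings. The function name and the sample readings are hypothetical and are not data from Fig. 7B.

```python
from typing import Sequence, Tuple


def melting_peak(temps: Sequence[float], fluor: Sequence[float]) -> Tuple[float, float]:
    """Return (temperature, slope) at the maximum of -dF/dT.

    The derivative is approximated between each pair of adjacent readings and
    the temperature reported is the midpoint of the steepest interval.
    """
    if len(temps) != len(fluor) or len(temps) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    best_temp, best_slope = temps[0], float("-inf")
    for i in range(len(temps) - 1):
        dt = temps[i + 1] - temps[i]
        slope = -(fluor[i + 1] - fluor[i]) / dt  # positive where signal drops
        if slope > best_slope:
            best_slope = slope
            best_temp = (temps[i] + temps[i + 1]) / 2
    return best_temp, best_slope


# Hypothetical readings for illustration only.
temps = [80.0, 81.0, 82.0, 83.0, 84.0]
fluor = [10.0, 9.8, 9.0, 4.0, 3.8]
tm_estimate, peak = melting_peak(temps, fluor)
```

A single sharp peak at a temperature matching the positive-control plasmid is consistent with a specific product, whereas additional peaks at lower temperatures would suggest primer-dimers or other non-specific products.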

CLINICAL RESULTS 

Clinical findings: 

All 50 patients with SARS were ethnic Chinese. They represented 5 different 
epidemiologically linked clusters as well as additional sporadic cases fitting the case 
definition. They were hospitalized at a mean of 5 days after the onset of symptoms. The 
median age was 42 years (range of 23 to 74) and the female to male ratio was 1.3. Fourteen 
(28%) were health care workers and five (10%) had a history of visit to a hospital 
experiencing a major outbreak of SARS. Thirteen (26%) patients had household contacts 
and 12 (24%) others had social contacts with patients with SARS. Four (8%) had a history 
of recent travel to mainland China. 

The major complaints from most patients were fever (90%) and shortness of breath. 
Cough and myalgia were present in more than half the patients (Table 2). Upper respiratory 
tract symptoms such as rhinorrhea (24%) and sore throat (20%) were present in a minority 
of patients. Diarrhea (10%) and anorexia (10%) were also reported. At initial examination, 
auscultatory findings, such as crepitations and decreased air entry, were present in only 38% 
of patients. Dry cough was reported by 62% of patients. All patients had radiological 
evidence of consolidation, at the time of admission, involving 1 zone (in 36), 2 zones (13) 
and 3 zones (1). 

Table 2 

Clinical symptoms         Number (percentage) 
Fever                     50 (100%) 
Chill or rigors           37 (74%) 
Cough                     31 (62%) 
Myalgia                   27 (54%) 
Malaise                   25 (50%) 
Running nose              12 (24%) 
Sore throat               10 (20%) 
Shortness of breath       10 (20%) 
Anorexia                  10 (20%) 
Diarrhea                  5 (10%) 
Headache                  10 (20%) 
Dizziness                 6 (12%) 



* Truncal maculopapular rash was noted in 1 patient. 

In spite of the high fever, most patients (98%) had no evidence of a leukocytosis. 
Lymphopenia (68%), leucopenia (26%), thrombocytopenia (40%) and anemia (18%) were 
present in peripheral blood examination (Table 3). Parenchymal liver enzyme, alanine 
aminotransferase (ALT) and muscle enzyme, creatine kinase (CPK) were elevated in 34% 
and 26% respectively. 

Table 3 

Laboratory parameter                      Mean (range)       Percentage of abnormal   Normal range 
Haemoglobin                               12.9 (8.9-15.9)                             11.5-16.5 g/dl 
  Anaemia                                                    9 (18%) 
White cell count                          5.17 (1.1-11.4)                             4-11 x 10^9/L 
  Leucopenia                                                 13 (26%) 
Lymphocyte count                          0.78 (0.3-1.5)                              1.5-4.0 x 10^9/L 
  Significant lymphopenia (<1.0 x 10^9/L)                    34 (68%) 
Platelet count                            174 (88-351)                                150-400 x 10^9/L 
  Thrombocytopenia                                           20 (40%) 
Alanine aminotransaminase (ALT)           63 (11-350)                                 6-53 U/L 
  Elevated ALT                                               17 (34%) 
Albumin                                   37 (26-50)                                  42-54 g/L 
  Low albumin                                                34 (68%) 
Globulin                                  33 (21-42)                                  24-36 g/L 
  Elevated globulin                                          10 (20%) 
Creatine kinase                           244 (31-1379)                               34-138 U/L 
  Elevated creatine kinase                                   13 (26%) 



Routine microbiological investigations for known viruses and bacteria by culture, 
antigen detection, and PCR were negative in most cases. Blood culture was positive for 
Escherichia coli in a 74-year-old male patient who was admitted to the intensive care unit; this 
was attributed to hospital-acquired urinary tract infection. Klebsiella pneumoniae and 
Haemophilus influenzae were isolated from the sputum specimens of 2 other patients on 
admission. 

Oral levofloxacin 500 mg q24h was given to 9 patients, and intravenous (1.2 g q8h)/ 
oral (375 mg tid) amoxicillin-clavulanate and intravenous/oral clarithromycin 500 mg q12h 
were given to another 40 patients. Four patients were given oral oseltamivir 75 mg bid. In 
one patient, intravenous ceftriaxone 2 g q24h, oral azithromycin 500 mg q24h, and oral 
amantadine 100 mg bid were given for empirical coverage of typical and atypical 
pneumonia. 

Nineteen patients progressed to severe disease with oxygen desaturation and 
required intensive care and ventilatory support. The mean time to deterioration 
from the onset of symptoms was 8.3 days. Intravenous ribavirin 8 mg/kg q8h and steroid 
were given to 49 patients at a mean of 6.7 days after the onset of symptoms. 

The risk factors associated with severe complicated disease requiring intensive care 
and ventilatory support were older age, lymphopenia, impaired ALT, and delayed initiation 
of ribavirin and steroid (Table 4). All the complicated cases were treated with ribavirin and 
steroid after admission to the intensive care unit, whereas all the uncomplicated cases were 
started on ribavirin and steroid in the general ward. As expected, 31 uncomplicated cases 
recovered or improved, whereas 8 complicated cases deteriorated, with one death at the time 
of writing. All 50 patients had been monitored for a mean of 12 days at the time of writing. 
Table 4 

                                                      Complicated       Uncomplicated 
                                                      case (n = 19)     case (n = 31)     P value 

Mean (SD) age (range)                                 49.5 ± 12.7       39.0 ± 10.7       P < 0.01 
Male / Female ratio                                   8 / 11            14 / 17           N.S. 
Underlying illness                                    5 †               1 *               P < 0.05 
Mode of contact 
  Travel to China                                     1                 [illegible]       N.S. 
  Health care worker                                  [illegible]       [illegible]       N.S. 
  Hospital visit                                      1                 4                 N.S. 
  Household contact                                   8                 [illegible]       P < 0.05 
  Social contact                                      [illegible]       10                N.S. 
Mean (SD) duration of symptoms to 
  admission (days)                                    5.2 ± 2.0         4.7 ± 2.5         N.S. 
Mean (SD) admission temperature (°C)                  [illegible]       [illegible]       N.S. 
Mean (SD) initial total peripheral WBC 
  count (x 10^9/L)                                    5.1 ± 2.4         5.2 ± 1.8         N.S. 
Mean (SD) initial lymphocyte count 
  (x 10^9/L)                                          0.66 ± 0.3        0.85 ± 0.3        P < 0.05 
Presence of thrombocytopenia 
  (< 150 x 10^9/L)                                    8                 12                N.S. 
Impaired liver function test                          11                [illegible]       P < 0.01 
CXR changes (number of zones affected)                1.4               1.2               N.S. 
Mean (SD) day of deterioration from the 
  onset of symptoms §                                 8.3 ± 2.6         Not applicable 
Mean (SD) day of initiation of ribavirin 
  & steroid from the onset of symptoms                7.7 ± 2.9         5.7 ± 2.6         P < 0.05 
Initiation of ribavirin & steroid after 
  deterioration                                       12                0                 P < 0.001 
Response to ribavirin & steroid                       11                28                P < 0.05 
Outcome 
  Improved or recovered                               10                31                P < 0.01 
  Not improving ||                                    8                 0                 P < 0.01 

Multi-variant analysis is not performed due to low number of cases; 
† 2 patients had diabetes mellitus, 1 had hypertrophic obstructive cardiomyopathy, 1 
had chronic active hepatitis B, and 1 had brain tumour; 
* 1 patient had essential hypertension; 
§ desaturation requiring intensive care support; 
|| 1 died. 

Two virus isolates, subsequently identified as members of Coronaviridae (see 
below), were isolated from two patients. One was from open lung biopsy tissue of a 
53-year-old Hong Kong Chinese resident and the other from a nasopharyngeal aspirate of a 
42-year-old female with good previous health. The 53-year-old male had a history of 10-hour 
household contact with a Chinese visitor who came from Guangzhou and later died from 
SARS. Two days after this exposure, he presented with fever, malaise, myalgia, and 
headache. Crepitations were present over the right lower zone and there was a 
corresponding alveolar shadow on the chest radiograph. Hematological investigation 
revealed lymphopenia of 0.7 x 10^9/L with normal total white cell and platelet counts. Both 
ALT (41 U/L) and CPK (405 U/L) were impaired. Despite a combination of oral 
azithromycin, amantadine, and intravenous ceftriaxone, there were increasing bilateral 
pulmonary infiltrates and progressive oxygen desaturation. Therefore, an open lung biopsy 
was performed 9 days after admission. Histopathological examination showed mild 
interstitial inflammation with scattered alveolar pneumocytes showing cytomegaly, granular 
amphophilic cytoplasm and enlarged nuclei with prominent nucleoli. No cells showed 
inclusions typical of herpesvirus or adenovirus infection. The patient required ventilation 
and intensive care after the operative procedure. Empirical intravenous ribavirin and 
hydrocortisone were given. He succumbed 20 days after admission. In retrospect, 
coronavirus-like RNA was detected in his nasopharyngeal aspirate, lung biopsy and 
post-mortem lung. He had a significant rise in titer of antibodies against his own hSARS 
isolate, from 1/200 to 1/1600. 

The second patient from whom an hSARS virus was isolated was a 42-year-old 
female with good past health. She had a history of travel to Guangzhou in mainland China 
for 2 days. She presented with fever and diarrhea 5 days after her return to Hong Kong. 
Physical examination showed crepitation over the right lower zone, which had a 
corresponding alveolar shadow on the chest radiograph. Investigation revealed leucopenia 
(2.7 x 10^9/L), lymphopenia (0.6 x 10^9/L), and thrombocytopenia (104 x 10^9/L). Despite 
empirical antimicrobial coverage with amoxicillin-clavulanate, clarithromycin, and 
oseltamivir, she deteriorated 5 days after admission and required mechanical ventilation and 
intensive care for 5 days. She gradually improved without receiving treatment with 
ribavirin or steroid. Her nasopharyngeal aspirate was positive for the virus by RT-PCR, 
and she seroconverted from an antibody titer of <1/50 to 1/1600 against the hSARS isolate. 
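
The blood counts reported for the two index patients can be read against the normal ranges of Table 3. The following is a minimal sketch, assuming a simple low/normal/high comparison against those ranges; the function `flag_counts` and the dictionary keys are illustrative names, not part of the specification.

```python
# Normal ranges as given in Table 3 (all counts in units of 10^9/L).
NORMAL_RANGES = {
    "white_cell_count": (4.0, 11.0),
    "lymphocyte_count": (1.5, 4.0),
    "platelet_count": (150.0, 400.0),
}


def flag_counts(results):
    """Label each result 'low', 'normal' or 'high' against NORMAL_RANGES."""
    flags = {}
    for name, value in results.items():
        low, high = NORMAL_RANGES[name]
        if value < low:
            flags[name] = "low"
        elif value > high:
            flags[name] = "high"
        else:
            flags[name] = "normal"
    return flags


# Admission values reported in the text for the second patient (42-year-old female).
patient_2 = {
    "white_cell_count": 2.7,
    "lymphocyte_count": 0.6,
    "platelet_count": 104.0,
}

print(flag_counts(patient_2))
```

All three values for the second patient fall below the lower limits, matching the leucopenia, lymphopenia and thrombocytopenia described above. The first patient's lymphocyte count of 0.7 x 10^9/L would likewise be flagged as low.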
Virological findings: 

Viruses were isolated on FRhk-4 cells from the lung biopsy and nasopharyngeal 
aspirate, respectively, of the two patients described above. The initial cytopathic effect 
appeared between 2 and 4 days after inoculation, but on subsequent passage, cytopathic 
effect appeared in 24 hours. Neither virus isolate reacted with the routine panel of 
reagents used to identify virus isolates, including those for influenza A and B, parainfluenza 
types 1, 2 and 3, adenovirus and respiratory syncytial virus (DAKO, Glostrup, Denmark). They 
also failed to react in RT-PCR assays for influenza A and HMPV or in PCR assays for 
mycoplasma. The virus was ether sensitive, indicating that it was an enveloped virus. 
Electron microscopy of negatively stained (2% potassium phospho-tungstate, pH 7.0) cell 
culture extracts obtained by ultracentrifugation showed the presence of pleomorphic 
enveloped viral particles of about 80-90 nm (range 70-130 nm) in diameter, whose 
surface morphology appeared comparable to members of Coronaviridae (Figure 5A). Thin 
section electron microscopy of infected cells revealed virus particles of 55-90 nm diameter 
within smooth-walled vesicles in the cytoplasm (Figure 5A and 5B). Virus particles 
were also seen at the cell surface. The overall findings were compatible with infection of 
the cells by viruses of Coronaviridae. 

A thin section electron micrograph of the lung biopsy of the 53-year-old male 
contained 60-90 nm viral particles in the cytoplasm of desquamated cells. These viral 
particles were similar in size and morphology to those observed in the cell-cultured virus 
isolates from both patients (Figure 4). 

The RT-PCR products generated in a random primer RT-PCR assay were analyzed, 
and unique bands found in the virus-infected specimen were cloned and sequenced. Of 30 
clones examined, a clone containing 646 base pairs (SEQ ID NO: 1) of unknown origin was 
identified. Sequence analysis of this DNA fragment suggested that this sequence had weak 
homology to viruses of the family Coronaviridae (data not shown). The deduced amino 
acid sequence (215 amino acids: SEQ ID NO:2) from this unknown sequence, however, had 
the highest homology (57%) to the RNA polymerase of bovine coronavirus and murine 
hepatitis virus, confirming that this virus belongs to the family Coronaviridae. 
Phylogenetic analysis of the protein sequences showed that this virus, though most closely 
related to the group II coronaviruses, was a distinct virus (Figures 5A and 5B). 

Based on the 646 bp sequence of the isolate, specific primers for detecting the new 
virus were designed for RT-PCR detection of the hSARS virus genome in clinical 
specimens. Of the 44 nasopharyngeal specimens available from the 50 SARS patients, 22 
had evidence of hSARS RNA. Viral RNA was detectable in 10 of 18 fecal samples tested. 
The specificity of the RT-PCR reaction was confirmed by sequencing selected positive 
RT-PCR amplified products. None of 40 nasopharyngeal and fecal specimens from patients 
with unrelated diseases was reactive in the RT-PCR assay. 

To determine the dynamic range of real-time quantitative PCR, serial dilutions of 
plasmid DNA containing the target sequence were made and subjected to the real-time 
quantitative PCR assay. As shown in Figure 7A, the assay was able to detect as few as 10 
copies of the target sequence. By contrast, no signal was observed in the water control 
(Figure 7A). Positive signals were observed in 23 out of 29 serologically confirmed SARS 
patients. In all of these positive cases, a unique PCR product (Tm = 82°C) corresponding to 
the signal from the positive control was observed (Figure 7B, and data not shown). These 
results indicated that this assay is highly specific to the target. The copy numbers of the target 
sequence in these reactions ranged from 4539 to less than 10. Thus, as many as 6.48 x 10^5 
copies of this viral sequence could be found in 1 ml of NPA sample. In 5 of the above 
positive cases, it was possible to collect NPA samples before seroconversion. Viral RNA 
was detected in 3 of these samples, indicating that this assay can detect the virus even at the 
early onset of infection. 
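
Copy numbers in a real-time quantitative PCR assay are commonly read off a standard curve built from a plasmid dilution series such as the one described above. The following is a minimal sketch of that general technique, assuming the usual log-linear relationship between threshold cycle (Ct) and starting copy number. The specification does not report Ct values, so the dilution series, the Ct readings and the function names below are all hypothetical and serve only as illustration.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate starting copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold plasmid dilution series (copies per reaction) and
# invented Ct readings; neither is taken from the specification.
standard_copies = [10, 100, 1_000, 10_000, 100_000]
standard_ct = [36.7, 33.4, 30.1, 26.8, 23.5]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"estimated copies at Ct 28.0: {copies_from_ct(28.0, slope, intercept):.0f}")
```

A per-reaction estimate obtained this way would still need to be scaled by the extraction and dilution volumes to give copies per ml of NPA sample; those volumes are not stated in this section.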

To further validate the specificity of this assay, NPA samples from healthy 
individuals (n=11) and patients suffering from adenovirus (n=11), respiratory syncytial virus 
(n=11), human metapneumovirus (n=11), influenza A virus (n=13) or influenza B virus 
(n=1) infection were recruited as negative controls. All of these samples, except one, were 
negative in the assay. The false positive case was negative in a subsequent test. Taken 
together, including the initial false positive case, the real-time quantitative PCR assay has 
a sensitivity of 79% and a specificity of 98%. 
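
The reported sensitivity and specificity follow from the counts given above. The following is a minimal sketch, assuming the control group sizes read as 11, 11, 11, 11, 13 and 1 and a single false positive among them; the variable names are illustrative.

```python
def proportion(numerator, denominator):
    """Return numerator / denominator as a fraction between 0 and 1."""
    return numerator / denominator


# Serologically confirmed SARS patients: 23 of 29 were positive in the assay.
true_positives, confirmed_cases = 23, 29

# Negative-control groups and their sizes as listed in the text.
control_groups = {
    "healthy individuals": 11,
    "adenovirus": 11,
    "respiratory syncytial virus": 11,
    "human metapneumovirus": 11,
    "influenza A virus": 13,
    "influenza B virus": 1,
}
false_positives = 1  # the single control sample that was initially positive

total_controls = sum(control_groups.values())
sensitivity = proportion(true_positives, confirmed_cases)
specificity = proportion(total_controls - false_positives, total_controls)

print(f"sensitivity = {sensitivity:.0%}")  # sensitivity = 79%
print(f"specificity = {specificity:.0%}")  # specificity = 98%
```

With 58 controls in total, 57 true negatives give 98%, and 23 of 29 confirmed cases give 79%, in agreement with the figures stated in the text.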

Epidemiological data suggest that droplet transmission is one of the major routes of 
transmission of this virus. The detection of live virus and of high copy numbers of 
viral sequence in NPA samples in the current study supports the view that cough and sneeze 
droplets from SARS patients might be the major source of this infectious agent. 
Interestingly, 2 out of 4 available stool samples from the SARS patients in this study were 
positive in the assay (data not shown). The detection of the virus in feces suggests that 
there might be other routes of transmission. It is relevant to note that a number of animal 
coronaviruses are spread via the fecal-oral route (McIntosh K., 1974, Coronaviruses: a 
comparative review. Current Top Microbiol Immunol. 63: 85-112). However, further 
studies are required to test whether the virus in feces is infectious or not. 

Currently, apart from this hSARS virus, there are two known serogroups of human 
coronaviruses (229E and OC43) (Hruskova J. et al., 1990, Antibodies to human 
coronaviruses 229E and OC43 in the population of C.R., Acta Virol 34:346-52). The 
primer set used in the present assay does not have homology to the strain 229E. Due to the 
lack of an available corresponding OC43 sequence in GenBank, it is not known whether 
these primers would cross-react with this strain. However, sequence analyses of available 
sequences in other regions of the OC43 polymerase gene indicate that the novel human virus 
associated with SARS is genetically distinct from OC43. Furthermore, the primers used in 
this study do not have homology to any sequences from known coronaviruses. Thus, it is 
very unlikely that these primers would cross-react with the strain OC43. 

Apart from the novel pathogen, metapneumovirus was reported to be identified in 
some SARS patients (Centers for Disease Control and Prevention, 2003, Morbidity and 
Mortality Weekly Report 52: 269-272). No evidence of metapneumovirus infection was 
detected in any of the patients in this study (data not shown), suggesting that the novel 
hSARS virus of the invention is the key player in the pathogenesis of SARS. 



Immunofluorescent antibody detection: 

Thirty-five of the 50 most recent serum samples from patients with SARS had 
evidence of antibodies to the hSARS virus (see Fig. 3). Of 27 patients from whom paired acute 
and convalescent sera were available, all seroconverted or had a >4-fold increase in 
antibody titer to the virus. Five other pairs of sera from additional SARS patients from 
clusters outside this study group were also tested to provide a wider sampling of SARS 
patients in the community, and all of them seroconverted. None of 80 sera from 
patients with respiratory or other diseases and none of 200 sera from normal blood donors had 
detectable antibody. 
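
The serological criterion used above (seroconversion or a fourfold rise in titer between paired sera) can be written as a small rule. The following is a minimal sketch, assuming titers are stored as reciprocal dilutions, that `None` stands for no detectable antibody at the lowest dilution tested, and that a rise of fourfold or more counts as significant; the function names and the exact threshold handling are assumptions, not taken from the specification.

```python
def fold_rise(acute_titer, convalescent_titer):
    """Fold increase between reciprocal titers, e.g. 1/200 is passed as 200."""
    return convalescent_titer / acute_titer


def interpret_paired_sera(acute_titer, convalescent_titer, threshold=4):
    """Classify a pair of acute and convalescent sera.

    Titers are reciprocal dilutions; None means no antibody was detected.
    The fourfold threshold is an assumed reading of the criterion in the text.
    """
    if convalescent_titer is None:
        return "no antibody detected"
    if acute_titer is None:
        return "seroconversion"
    if fold_rise(acute_titer, convalescent_titer) >= threshold:
        return "significant rise in titer"
    return "no significant rise"


# First index patient: titer rose from 1/200 to 1/1600.
print(interpret_paired_sera(200, 1600))
# Second index patient: from below 1/50 (undetectable) to 1/1600.
print(interpret_paired_sera(None, 1600))
```

For the two index patients described earlier, the first pair corresponds to an eightfold rise and the second to a seroconversion.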

When either seropositivity to HP-CV in a single serum or viral RNA detection in the 
NPA or stool is considered evidence of infection with the hSARS virus, 45 of the 50 patients 
had evidence of infection. Of the 5 patients without any virological evidence of 
Coronaviridae viral infection, only one had serum tested more than 14 days after the 
onset of clinical disease. 

DISCUSSION 

The outbreak of SARS is unusual in a number of aspects, in particular in the 
appearance of clusters of patients with pneumonia among health care workers and family 
contacts. In this series of patients with SARS, investigations for conventional pathogens of 
atypical pneumonia proved negative. However, a virus that belongs to the family 
Coronaviridae was isolated from the lung biopsy and nasopharyngeal aspirate obtained 
from two SARS patients, respectively. Phylogenetically, the virus was not closely related to 
any known human or animal coronavirus or torovirus. The present analysis is based on a 
646 bp fragment (SEQ ID NO: 1) of the polymerase gene and the entire genome of the 
isolated hSARS virus, which indicates that the virus relates to antigenic group 2 of the 
coronaviruses along with murine hepatitis virus and bovine coronavirus. However, viruses 
of the Coronaviridae can undergo heterologous recombination within the virus family, and 
genetic analysis of other parts of the genome needs to be carried out before the nature of this 
new virus is more conclusively defined (Holmes KV. Coronaviruses. Eds Knipe DM, 
Howley PM. Fields Virology, 4th Edition, Lippincott Williams & Wilkins, Philadelphia, 
1187-1203). The biological, genetic and clinical data, taken together, indicate that the new 
virus is not one of the two known human coronaviruses. 

The majority (90%) of patients with clinically defined SARS had either serological 
or RT-PCR evidence of infection by this virus. In contrast, neither antibody nor viral RNA 
was detectable in healthy controls. All 27 patients from whom acute and convalescent sera 
were available demonstrated rising antibody titers to hSARS virus, strengthening the 
contention that a recent infection with this virus is a necessary factor in the evolution of 
SARS. In addition, all five pairs of acute and convalescent sera tested from patients from 
other hospitals in Hong Kong also showed seroconversion to the virus. The five patients 
who have not shown serological or virological evidence of hSARS virus infection need to 
have later convalescent sera tested to determine whether they have also seroconverted. However, the 
concordance of the hSARS virus with the clinical definition of SARS appears remarkable, 
given that clinical case definitions are never perfect. 

No evidence of HMPV infection, either by RT-PCR or rising antibody titer against 
HMPV, was detected in any of these patients. No other pathogen was consistently detected 
in our group of patients with SARS. It is therefore highly likely that this hSARS virus 
is either the cause of SARS or a necessary prerequisite for disease progression. Whether or 
not other microbial or other co-factors play a role in progression of the disease remains to 
be investigated. 

The family Coronaviridae includes the genera Coronavirus and Torovirus. They are 
enveloped RNA viruses which cause disease in humans and animals. The previously 
known human coronaviruses, types 229E and OC43, are the major causes of the common 
cold (Holmes KV. Coronaviruses. Eds Knipe DM, Howley PM. Fields Virology, 4th 
Edition, Lippincott Williams & Wilkins, Philadelphia, 1187-1203). While they can 
occasionally cause pneumonia in older adults, neonates or immunocompromised patients 
(El-Sahly HM, Atmar RL, Glezen WP, Greenberg SB. Spectrum of clinical illness in 
hospitalized patients with "common cold" virus infections. Clin Infect Dis. 2000; 31: 
96-100; and Foltz EJ, Elkordy MA. Coronavirus pneumonia following autologous bone 
marrow transplantation for breast cancer. Chest 1999; 115: 901-905), coronaviruses have 
been reported to be an important cause of pneumonia in military recruits, accounting for up 
to 30% of cases in some studies (Wenzel RP, Hendley JO, Davies JA, Gwaltney JM. 
Coronavirus infections in military recruits: Three-year study with coronavirus strains OC43 
and 229E. Am Rev Respir Dis. 1974; 109: 621-624). Human coronaviruses can infect 
neurons, and viral RNA has been detected in the brain of patients with multiple sclerosis 
(Talbot PJ, Cote G, Arbour N. Human coronavirus OC43 and 229E persistence in neural 
cell cultures and human brains. Adv Exp Med Biol. - in press). On the other hand, a 
number of animal coronaviruses (e.g., Porcine Transmissible Gastroenteritis Virus, Murine 
Hepatitis Virus, Avian Infectious Bronchitis Virus) cause respiratory, gastrointestinal, 
neurological or hepatic disease in their respective hosts (McIntosh K. Coronaviruses: a 
comparative review. Current Top Microbiol Immunol 1974; 63: 85-112). 

We describe for the first time the clinical presentation and complications of SARS. 
Fewer than 25% of patients with coronaviral pneumonia had upper respiratory tract 
symptoms. As expected in atypical pneumonia, both respiratory symptoms and positive 
auscultatory findings were very disproportional to the chest radiographic findings. 
Gastrointestinal symptoms were present in 10%. It is relevant that viral RNA is detected 
in the faeces of some patients and that coronaviruses have been associated with diarrhoea in 
animals and humans (Caul EO, Egglestone SI. Further studies on human enteric 
coronaviruses. Arch Virol 1977; 54: 107-17). The high incidence of deranged liver function 
tests, leucopenia, significant lymphopenia, thrombocytopenia and subsequent evolution into 
adult respiratory distress syndrome suggests severe systemic inflammatory damage 
induced by this hSARS virus. Thus immuno-modulation by steroid may be important to 
complement the antiviral therapy by ribavirin. In this regard, it is pertinent that severe 
human disease associated with the avian influenza subtype H5N1, another virus that 
recently crossed from animals to humans, has also been postulated to have an 
immuno-pathological component (Cheung CY, Poon LLM, Lau ASY et al. Induction of 
proinflammatory cytokines in human macrophages by influenza A (H5N1) viruses: a 
mechanism for the unusual severity of human disease. Lancet 2002; 360: 1831-1837). In 
common with H5N1 disease, patients with severe SARS are adults, are significantly more 
lymphopenic and have parameters of organ dysfunction beyond the respiratory tract (Table 
4) (Yuen KY, Chan PKS, Peiris JSM, et al. Clinical features and rapid viral diagnosis of 
human disease associated with avian influenza A H5N1 virus. Lancet 1998; 351: 467-471). 
It is important to note that a window of opportunity of around 8 days exists from the onset 
of symptoms to respiratory failure. Severe complicated cases are strongly associated with 
both underlying disease and delayed use of ribavirin and steroid therapy. Following our 
clinical experience in the initial cases, this combination therapy was started very early in 
subsequent cases, which were largely uncomplicated at the time of admission. The 
overall mortality at the time of writing is only 2% with this treatment regimen. There were 
still 8 out of 19 complicated cases who had not shown a significant response. It is not 
possible to perform a detailed analysis of the therapeutic response to this combination regimen 
due to the heterogeneous dosing and time of initiation of therapy. 

Other factors associated with severe disease are acquisition of the disease through 
household contact, which may be attributed to a higher dose or longer duration of viral 
exposure, and the presence of underlying diseases. 

The clinical description reported here pertains largely to the more severe cases 
admitted to hospital. We presently have no data on the full clinical spectrum of the 
emerging Coronaviridae infection in the community or in an out-patient setting. The 
availability of diagnostic tests as described here will help address these questions. In 
addition, it will allow questions pertaining to the period of virus shedding (and 
communicability) during convalescence, the presence of virus in other body fluids and 
excreta, and the presence of virus shedding during the incubation period to be addressed. 

The epidemiological data at present appear to indicate that the virus is spread by 
droplets or by direct and indirect contact, although airborne spread cannot be ruled out in 
some instances. The finding of infectious virus in the respiratory tract supports this 
contention. Preliminary evidence also suggests that the virus may be shed in the feces. 
However, it is important to note that detection of viral RNA does not prove that the virus is 
viable or transmissible. If viable virus is detectable in the feces, this would be a potential 
additional route of transmission that needs to be considered. It is relevant to note that a 
number of animal coronaviruses are spread via the fecal-oral route (McIntosh K. 
Coronaviruses: a comparative review. Current Top Microbiol Immunol. 1974; 63: 85-112). 

In conclusion, this report provides evidence that a virus in the Coronaviridae family 
is the etiological agent of SARS. 


7. DEPOSIT 

A sample of isolated hSARS virus was deposited with the China Center for Type 
Culture Collection (CCTCC) at Wuhan University, Wuhan 430072 in China on April 2, 
2003 in accordance with the Budapest Treaty on the Deposit of Microorganisms, and 
accorded accession No. CCTCC-V200303, which is incorporated herein by reference in its 
entirety. 



8. MARKET POTENTIAL 

The hSARS virus can now be grown on a large scale, which allows the development 
of various diagnostic tests as described hereinabove as well as the development of vaccines 
and antiviral agents that are effective in preventing, ameliorating or treating SARS. Given 
the severity of the disease and its rapid global spread, it is highly likely that significant 
demand for diagnostic tests, therapies and vaccines to battle the disease will arise 
on a global scale. In addition, this virus contains genetic information which is extremely 
important and valuable for clinical and scientific research applications. 

9. EQUIVALENTS 

Those skilled in the art will recognize, or be able to ascertain many equivalents to 
the specific embodiments of the invention described herein using no more than routine 
experimentation. Such equivalents are intended to be encompassed by the following claims. 

All publications, patents and patent applications mentioned in this specification are 
herein incorporated by reference into the specification to the same extent as if each 
individual publication, patent or patent application was specifically and individually 
indicated to be incorporated herein by reference. 

Citation or discussion of a reference herein shall not be construed as an admission 
that such is prior art to the present invention. 

WHAT IS CLAIMED: 

1. An isolated hSARS virus having China Center for Type Culture Collection Deposit 
Accession No. CCTCC-V200303. 

2. An isolated hSARS virus comprising a nucleic acid molecule comprising the 
nucleotide sequence of SEQ ID NO:1 or a nucleotide sequence that hybridizes to SEQ ID 
NO:1 under stringent conditions. 

3. An isolated hSARS virus comprising a nucleic acid molecule comprising the 
nucleotide sequence of SEQ ID NO: 11 or a nucleotide sequence that hybridizes to SEQ ID 
NO: 11 under stringent conditions. 

4. An isolated hSARS virus comprising a nucleic acid molecule comprising the 
nucleotide sequence of SEQ ID NO: 13 or a nucleotide sequence that hybridizes to SEQ ID 
NO: 13 under stringent conditions. 

5. The hSARS virus of any one of claims 1-4 which is killed. 

6. The hSARS virus of any one of claims 1-4 which is attenuated. 

7. The attenuated hSARS virus of claim 6 whose infectivity is reduced. 

8. The attenuated hSARS virus of claim 7 whose infectivity is reduced by at least 
5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 250-fold, 500-fold, or 10,000-fold. 

9. The attenuated hSARS virus of claim 6 whose replication ability is reduced. 

10. The attenuated hSARS virus of claim 9 whose replication ability is reduced by at 
least 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 250-fold, 500-fold, 1,000-fold, or 10,000- 
fold. 

11. The attenuated hSARS virus of claim 6 whose protein synthesis ability is reduced. 

12. The attenuated hSARS virus of claim 11 whose protein synthesis ability is reduced 
by at least 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 250-fold, 500-fold, 1,000-fold, or 
10,000-fold. 

13. The attenuated hSARS virus of claim 6 whose assembling ability is reduced. 

14. The attenuated hSARS virus of claim 13 whose assembling ability is reduced by at 
least 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 250-fold, 500-fold, 1,000-fold, or 10,000- 
fold. 

15. The attenuated hSARS virus of claim 6 whose cytopathic effect is reduced. 

16. The attenuated hSARS virus of claim 15 whose cytopathic effect is reduced by at 
least 5-fold, 10-fold, 25-fold, 50-fold, 100-fold, 250-fold, 500-fold, 1,000-fold, or 10,000- 
fold. 

17. An isolated nucleic acid molecule comprising a nucleotide sequence encoding the 
hSARS virus of any one of claims 1-4 or a complement thereof. 

18. An isolated nucleic acid molecule which hybridizes under stringent conditions to the 
nucleic acid molecule of claim 17 or a complement thereof. 

19. An isolated nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 1 or a complement thereof. 

20. An isolated nucleic acid molecule comprising a nucleotide sequence having at least 
100, 150, 200, 250, 300, 350, 400, 450, 500, 550 or 600 contiguous nucleotides of the 
nucleotide sequence of SEQ ID NO: 1, or a complement thereof. 

21. An isolated nucleic acid molecule comprising a nucleotide sequence that encodes the 
amino acid sequence of SEQ ID NO:2 or a complement of said nucleotide sequence. 

22. An isolated nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 11 or a complement thereof. 

23. An isolated nucleic acid molecule comprising a nucleotide sequence having at least 
45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 
800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150 or 1,200 contiguous nucleotides of the 
nucleotide sequence of SEQ ID NO: 11, or a complement thereof. 

24. An isolated nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 13 or a complement thereof. 

25. An isolated nucleic acid molecule comprising a nucleotide sequence having at least 
5, 500, 550, 600, 650 or 700 contiguous nucleotides of the nucleotide sequence of SEQ ID 
NO: 13, or a complement thereof. 

26. An isolated nucleic acid molecule which hybridizes under stringent conditions to a 
nucleic acid molecule having the nucleotide sequence of SEQ ID NO: 1, 11, or 13, or a 
complement thereof, wherein the nucleic acid molecule encodes an amino acid sequence 
which has a biological activity exhibited by a polypeptide encoded by the nucleotide 
sequence of SEQ ID NO: 1, 11 or 13. 

27. The nucleic acid molecule of claim 17, wherein the molecule is RNA. 

28. The nucleic acid molecule of claim 18, wherein the molecule is RNA. 

29. The nucleic acid molecule of any one of claims 19-26, wherein the molecule is RNA. 

30. The nucleic acid molecule of claim 17, wherein the molecule is DNA. 

31. The nucleic acid molecule of claim 18, wherein the molecule is DNA. 

32. The nucleic acid molecule of any one of claims 19-26, wherein the molecule is DNA. 

33. An isolated polypeptide encoded by the nucleic acid molecule of claim 17. 

34. An isolated polypeptide encoded by the nucleic acid molecule of claim 18. 

35. An isolated polypeptide encoded by the nucleic acid molecule of any one of claims 
19-26. 

36. An isolated polypeptide comprising the amino acid sequence of SEQ ID NO:2. 

37. An isolated polypeptide comprising the amino acid sequence having at least 5, 10, 
15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150 or 200 contiguous amino acid 
residues of the amino acid sequence of SEQ ID NO:2. 

38. An isolated polypeptide comprising the amino acid sequence of SEQ ID NO: 12. 

39. An isolated polypeptide comprising an amino acid sequence having at least 5, 10, 15, 
20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350 or 400 contiguous 
amino acid residues of the amino acid sequence of SEQ ID NO: 12. 

40. An isolated polypeptide comprising the amino acid sequence of SEQ ID NO: 14. 

41. An isolated polypeptide comprising an amino acid sequence having at least 5, 10, 15, 
20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150 or 200 contiguous amino acid residues of 
the amino acid sequence of SEQ ID NO:14. 

42. An isolated antibody or an antigen-binding fragment thereof which 
immunospecifically binds to the hSARS virus of Deposit Accession No: CCTCC-V200303. 

43. The isolated antibody of claim 42 or an antigen-binding fragment thereof which 
neutralizes an hSARS virus. 

44. An isolated antibody or an antigen-binding fragment thereof which 
immunospecifically binds to the hSARS virus of any one of claims 2-4. 

45. The isolated antibody of claim 44 or an antigen-binding fragment thereof which 
neutralizes an hSARS virus. 

46. An isolated antibody or an antigen-binding fragment thereof which 
immunospecifically binds to the polypeptide of claim 33. 

47. The isolated antibody of claim 46 or an antigen-binding fragment thereof which 
neutralizes an hSARS virus. 

48. An isolated antibody or an antigen-binding fragment thereof which 
immunospecifically binds to the polypeptide of claim 34. 

49. The isolated antibody of claim 48 or an antigen-binding fragment thereof which 
neutralizes an hSARS virus. 

50. An isolated antibody or an antigen-binding fragment thereof which 
immunospecifically binds to the polypeptide of claim 35. 

51. The isolated antibody of claim 50 or an antigen-binding fragment thereof which 
neutralizes an hSARS virus. 

52. An isolated antibody or an antigen-binding fragment thereof which 
immunospecifically binds to the polypeptide of any one of claims 36-41. 

53. The isolated antibody of claim 52 or an antigen-binding fragment thereof which 
neutralizes an hSARS virus. 

54. A method for detecting the presence of the hSARS virus of any one of claims 1-4 in 
a biological sample, said method comprising: 

(a) contacting the sample with a compound that selectively binds to said hSARS 
virus; and 

(b) detecting whether the compound binds to said hSARS virus in the sample. 

55. The method of claim 54, wherein the biological sample is selected from the group 
consisting of cells, blood, serum, plasma, saliva, urine, stool, sputum, and nasopharyngeal 
aspirates. 

56. The method of claim 54, wherein the compound that binds to said virus is an 
antibody. 

57. The method of claim 54, wherein the compound that binds to said virus is a nucleic 
acid molecule comprising the nucleotide sequence of SEQ ID NO: 1 or a complement 
thereof. 

58. The method of claim 54, wherein the compound that binds to said virus is a nucleic 
acid molecule comprising a nucleotide sequence having at least 5, 10, 15, 20, 25, 30, 35, 40, 
45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550 or 600 contiguous 
nucleotides of the nucleotide sequence of SEQ ID NO: 1, or a complement thereof. 

59. The method of claim 54, wherein the compound that binds to said virus is a nucleic 
acid molecule comprising the nucleotide sequence of SEQ ID NO: 11 or a complement 
thereof. 

60. The method of claim 54, wherein the compound that binds to said virus is a nucleic 
acid molecule comprising a nucleotide sequence having at least 5, 10, 15, 20, 25, 30, 35, 40, 
45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 
800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150 or 1,200 contiguous nucleotides of the 
nucleotide sequence of SEQ ID NO: 11, or a complement thereof. 

61. The method of claim 54, wherein the compound that binds to said virus is a nucleic 
acid molecule comprising the nucleotide sequence of SEQ ID NO: 13 or a complement 
thereof. 

62. The method of claim 54, wherein the compound that binds to said virus is a nucleic 
acid molecule comprising a nucleotide sequence having at least 5, 10, 15, 20, 25, 30, 35, 40, 
45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650 or 700 
contiguous nucleotides of the nucleotide sequence of SEQ ID NO: 13, or a complement 
thereof. 

63. A method for detecting the presence of the polypeptide of claim 33 in a biological 
sample, said method comprising: 

(a) contacting the biological sample with a compound that selectively binds to 
said polypeptide; and 

(b) detecting whether the compound binds to said polypeptide in the sample. 

64. The method of claim 63, wherein the biological sample is selected from the group 
consisting of cells, blood, serum, plasma, saliva, urine, stool, sputum, and nasopharyngeal 
aspirates. 

65. The method of claim 63, wherein the compound that binds to said polypeptide is an 
antibody or an antigen-binding fragment thereof. 

66. A method for detecting the presence of the polypeptide of claim 34 in a biological 
sample, said method comprising: 

(a) contacting the biological sample with a compound that selectively binds to 
said polypeptide; and 

(b) detecting whether the compound binds to said polypeptide in the sample. 

67. The method of claim 66, wherein the biological sample is selected from the group 
consisting of cells, blood, serum, plasma, saliva, urine, stool, sputum, and nasopharyngeal 
aspirates. 

68. The method of claim 66, wherein the compound that binds to said polypeptide is an 
antibody or an antigen-binding fragment thereof. 

69. A method for detecting the presence of the polypeptide of claim 35 in a biological 
sample, said method comprising: 

(a) contacting the biological sample with a compound that selectively binds to 
said polypeptide; and 

(b) detecting whether the compound binds to said polypeptide in the sample. 

70. The method of claim 69, wherein the biological sample is selected from the group 
consisting of cells, blood, serum, plasma, saliva, urine, stool, sputum, and nasopharyngeal 
aspirates. 

71. The method of claim 69, wherein the compound that binds to said polypeptide is an 
antibody or an antigen-binding fragment thereof. 

72. A method for detecting the presence of the polypeptide of any one of claims 36-41 in a 
biological sample, said method comprising: 

(a) contacting the biological sample with a compound that selectively binds to 
said polypeptide; and 

(b) detecting whether the compound binds to said polypeptide in the sample. 

73. The method of claim 72, wherein the biological sample is selected from the group 
consisting of cells, blood, serum, plasma, saliva, urine, stool, sputum, and nasopharyngeal 
aspirates. 

74. The method of claim 72, wherein the compound that binds to said polypeptide is an 
antibody or an antigen-binding fragment thereof. 

75. A method for detecting the presence of a first nucleic acid molecule derived from 
the hSARS virus of claim 1 in a biological sample, said method comprising: 

(a) contacting the biological sample with a compound that selectively binds to 
said first nucleic acid molecule; and 

(b) detecting whether the compound binds to said first nucleic acid molecule in 
the sample. 

76. The method of claim 75, wherein the compound that binds to said first nucleic acid 
molecule is a second nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 1 or a complement thereof. 

77. The method of claim 75, wherein the second nucleic acid molecule comprises at 
least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 
450, 500, 550 or 600 contiguous nucleotides of the nucleotide sequence of SEQ ID NO: 1, or 
a complement thereof. 

78. The method of claim 75, wherein the compound that binds to said first nucleic acid 
molecule is a second nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 11 or a complement thereof. 

79. The method of claim 75, wherein the second nucleic acid molecule comprises at 
least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 
450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150 or 1,200 
contiguous nucleotides of the nucleotide sequence of SEQ ID NO: 11, or a complement 
thereof. 

80. The method of claim 75, wherein the compound that binds to said first nucleic acid 
molecule is a second nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 13 or a complement thereof. 

81. The method of claim 75, wherein the second nucleic acid molecule comprises at 
least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 
450, 500, 550, 600, 650 or 700 contiguous nucleotides of the nucleotide sequence of SEQ 
ID NO: 13, or a complement thereof. 

82. A method for detecting the presence of a first nucleic acid molecule derived from 
the hSARS virus of any one of claims 2-4 in a biological sample, said method comprising: 

(a) contacting the biological sample with a compound that selectively binds to 
said first nucleic acid molecule; and 

(b) detecting whether the compound binds to said first nucleic acid molecule in 
the sample. 

83. The method of claim 82, wherein the compound that binds to said first nucleic acid 
molecule is a second nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 1 or a complement thereof. 

84. The method of claim 82, wherein the second nucleic acid molecule comprises at 
least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 
450, 500, 550 or 600 contiguous nucleotides of the nucleotide sequence of SEQ ID NO:1, or 
a complement thereof. 

85. The method of claim 82, wherein the compound that binds to said first nucleic acid 
molecule is a second nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 11 or a complement thereof. 

86. The method of claim 82, wherein the second nucleic acid molecule comprises at 
least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 
450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150 or 1,200 
contiguous nucleotides of the nucleotide sequence of SEQ ID NO:11, or a complement 
thereof. 

87. The method of claim 82, wherein the compound that binds to said first nucleic acid 
molecule is a second nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 13 or a complement thereof. 

88. The method of claim 82, wherein the second nucleic acid molecule comprises at 
least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 
450, 500, 550, 600, 650 or 700 contiguous nucleotides of the nucleotide sequence of SEQ 
ID NO: 13, or a complement thereof. 

89. A host cell infected with the hSARS virus of Deposit Accession No. CCTCC-V200303. 

90. The host cell of claim 89 which is a primate cell. 

91. The host cell of claim 90 which is a FRhK-4 fetal rhesus monkey kidney cell. 

92. A host cell infected with the hSARS virus of any one of claims 2-4. 

93. The host cell of claim 92 which is a primate cell. 

94. The host cell of claim 93 which is a FRhK-4 fetal rhesus monkey kidney cell. 

95. A method of detecting in a biological sample the presence of an antibody that 
immunospecifically binds hSARS virus, said method comprising: 

(a) contacting the biological sample with the host cell of claim 89; and 

(b) detecting the antibody bound to the cell. 

96. A method of detecting in a biological sample the presence of an antibody that 
immunospecifically binds hSARS virus, said method comprising: 

(a) contacting the biological sample with the host cell of claim 92; and 

(b) detecting the antibody bound to the cell. 

97. An immunogenic formulation comprising an immunogenically effective amount of 
the hSARS virus of claim 5, and a pharmaceutically acceptable carrier. 

98. An immunogenic formulation comprising an immunogenically effective amount of 
the hSARS virus of claim 6, and a pharmaceutically acceptable carrier. 

99. An immunogenic formulation comprising an immunogenically effective amount of a 
protein extract of the hSARS virus of claim 5 or a subunit thereof, and a pharmaceutically 
acceptable carrier. 

100. An immunogenic formulation comprising an immunogenically effective amount of a 
protein extract of the hSARS virus of claim 6 or a subunit thereof, and a pharmaceutically 
acceptable carrier. 

101. An immunogenic formulation comprising an immunogenically effective amount of a 
nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1 or a 
complement thereof, and a pharmaceutically acceptable carrier. 

102. An immunogenic formulation comprising an immunogenically effective amount of a 
nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 11 or a 
complement thereof, and a pharmaceutically acceptable carrier. 

103. An immunogenic formulation comprising an immunogenically effective amount of a 
nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 13 or a 
complement thereof, and a pharmaceutically acceptable carrier. 

104. An immunogenic formulation comprising an immunogenically effective amount of 
the polypeptide of claim 33. 

105. An immunogenic formulation comprising an immunogenically effective amount of 
the polypeptide of claim 34. 

106. An immunogenic formulation comprising an immunogenically effective amount of 
the polypeptide of claim 35. 

107. An immunogenic formulation comprising an immunogenically effective amount of 
the polypeptide of any one of claims 36-41. 

108. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of the hSARS virus of claim 5, and a pharmaceutically acceptable carrier. 

109. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of the hSARS virus of claim 6, and a pharmaceutically acceptable carrier. 

110. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of a protein extract of the hSARS virus of claim 5 or a subunit thereof, and a 
pharmaceutically acceptable carrier. 

111. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of a protein extract of the hSARS virus of claim 6 or a subunit thereof, and a 
pharmaceutically acceptable carrier. 

112. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:1 or 
a complement thereof, and a pharmaceutically acceptable carrier. 

113. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 11 or 
a complement thereof, and a pharmaceutically acceptable carrier. 

114. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of a nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 13 or 
a complement thereof, and a pharmaceutically acceptable carrier. 

115. A pharmaceutical composition comprising a prophylactically or therapeutically 
effective amount of an anti-hSARS agent and a pharmaceutically acceptable carrier. 

116. The pharmaceutical composition of claim 115, wherein the anti-hSARS agent is an 
antibody or an antigen-binding fragment thereof which immunospecifically binds to the 
hSARS virus of Deposit Accession No. CCTCC-V200303, or polypeptides or protein 
derived therefrom. 

117. The pharmaceutical composition of claim 115, wherein the anti-hSARS agent is a 
nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1, or a fragment 
thereof. 

118. The pharmaceutical composition of claim 115, wherein the anti-hSARS agent is a 
nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:11 or 13, or a 
fragment thereof. 

119. The pharmaceutical composition of claim 115, wherein the anti-hSARS agent is a 
polypeptide encoded by a nucleic acid molecule comprising the nucleotide sequence of SEQ 
ID NO: 1 or a fragment thereof having a biological activity of said polypeptide. 

120. The pharmaceutical composition of claim 115, wherein the anti-hSARS agent is a 
polypeptide encoded by a nucleic acid molecule comprising the nucleotide sequence of SEQ 
ID NO: 11 or 13, or a fragment thereof having a biological activity of said polypeptide. 

121. A kit comprising a container containing the immunogenic formulation of claim 97. 

122. A kit comprising a container containing the immunogenic formulation of claim 98. 

123. A kit comprising a container containing the immunogenic formulation of claim 99. 

124. A kit comprising a container containing the immunogenic formulation of claim 100. 

125. A kit comprising a container containing the immunogenic formulation of any one of 
claims 101-103. 

126. A kit comprising a container containing the immunogenic formulation of claim 104. 

127. A kit comprising a container containing the immunogenic formulation of claim 105. 

128. A kit comprising a container containing the immunogenic formulation of claim 106. 

129. A kit comprising a container containing the immunogenic formulation of claim 107. 

130. A kit comprising a container containing the vaccine formulation of claim 108. 

131. A kit comprising a container containing the vaccine formulation of claim 109. 

132. A kit comprising a container containing the vaccine formulation of claim 110. 

133. A kit comprising a container containing the vaccine formulation of claim 111. 

134. A kit comprising a container containing the vaccine formulation of any one of 
claims 112-114. 

135. A kit comprising a container containing the pharmaceutical composition of claim 
115. 

136. A method for identifying a subject infected with the hSARS virus of claim 1, 
comprising: 

(a) obtaining total RNA from a biological sample obtained from the subject; 

(b) reverse transcribing the total RNA to obtain cDNA; and 

(c) amplifying the cDNA using a set of primers derived from a nucleotide 
sequence of the hSARS virus. 

137. The method of claim 136, wherein the set of primers are derived from the nucleotide 
sequence of the genome of the hSARS virus of Deposit Accession No. CCTCC-V200303. 

138. The method of claim 136, wherein the set of primers are derived from the nucleotide 
sequence of SEQ ID NO: 1, 11 or 13, or a complement thereof. 

139. The method of claim 136, wherein the set of primers have the nucleotide sequence 
of SEQ ID NOS:3 and 4, respectively. 

140. A method for identifying a subject infected with the hSARS virus of any one of 
claims 2-4, comprising: 

(a) obtaining total RNA from a biological sample obtained from the subject; 

(b) reverse transcribing the total RNA to obtain cDNA; and 

(c) amplifying the cDNA using a set of primers derived from a nucleotide 
sequence of the hSARS virus. 

141. The method of claim 140, wherein the set of primers are derived from the nucleotide 
sequence of the genome of the hSARS virus of Deposit Accession No. CCTCC-V200303. 

142. The method of claim 140, wherein the set of primers are derived from the nucleotide 
sequence of SEQ ID NO: 1, 11 or 13, or a complement thereof. 

143. The method of claim 140, wherein the set of primers have the nucleotide sequence 
of SEQ ID NOS:3 and 4, respectively. 

144. An isolated hSARS virus having the nucleotide sequence of SEQ ID NO: 15 or a 
nucleotide sequence that hybridizes to SEQ ID NO: 15 under stringent conditions. 

145. An isolated nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO: 15 
or a complement thereof. 

146. An isolated nucleic acid molecule comprising a nucleotide sequence having at least 
5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 
750, 800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 4,000, 5,000, 6,000, 
7,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 
19,000, 20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000 or 29,000 
contiguous nucleotides of the nucleotide sequence of SEQ ID NO: 15, or a complement 
thereof. 

147. An isolated nucleic acid molecule comprising a nucleotide sequence which 
hybridizes under stringent conditions to the nucleic acid molecule of SEQ ID NO: 15 or a 
complement thereof. 

148. An isolated polypeptide encoded by the nucleic acid molecule of claim 145 or a 
fragment of said nucleic acid molecule. 

149. An isolated antibody or an antigen-binding fragment thereof which 
immunospecifically binds to the polypeptide of claim 148. 

150. The isolated antibody of claim 149 or an antigen-binding fragment thereof which 
neutralizes an hSARS virus. 

151. A method for detecting the presence of the hSARS virus of claim 144 in a biological 
sample, said method comprising: 

(a) contacting the sample with a compound that selectively binds to said hSARS 
virus; and 

(b) detecting whether the compound binds to said hSARS virus in the sample. 

152. The method of claim 151, wherein the biological sample is selected from the group 
consisting of cells, blood, serum, plasma, saliva, urine, stool, sputum, and nasopharyngeal 
aspirates. 

153. The method of claim 151, wherein the compound that binds to said virus is an 
antibody. 

154. The method of claim 151, wherein the compound that binds to said virus is a nucleic 
acid molecule comprising the nucleotide sequence of SEQ ID NO: 1, 11 or 13, or a 
complement thereof. 

155. A method for detecting the presence of the polypeptide of claim 148 in a biological 
sample, said method comprising: 

(a) contacting the biological sample with a compound that selectively binds to 
said polypeptide; and 

(b) detecting whether the compound binds to said polypeptide in the sample. 

156. The method of claim 155, wherein the biological sample is selected from the group 
consisting of cells, blood, serum, plasma, saliva, urine, stool, sputum, and nasopharyngeal 
aspirates. 

157. The method of claim 155, wherein the compound that binds to said polypeptide is an 
antibody or an antigen-binding fragment thereof. 

158. A method for detecting the presence of a nucleic acid molecule comprising the 
nucleotide sequence of SEQ ID NO: 15 or a fragment thereof in a biological sample, said 
method comprising: 

(a) contacting the biological sample with a compound that selectively binds to 
said nucleic acid molecule; and 

(b) detecting whether the compound binds to said nucleic acid molecule in the 
sample. 

159. The method of claim 158, wherein the biological sample is selected from the group 
consisting of cells, blood, serum, plasma, saliva, urine, stool, sputum, and nasopharyngeal 
aspirates. 

160. A host cell infected with the hSARS virus of claim 144. 

161. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of the hSARS virus of claim 144 and a pharmaceutically acceptable carrier, wherein 
the hSARS virus is killed. 

162. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of the hSARS virus of claim 144 and a pharmaceutically acceptable carrier, wherein 
the hSARS virus is attenuated. 

163. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of a protein extract of the hSARS virus of claim 144 and a pharmaceutically 
acceptable carrier. 

164. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of the polypeptide of claim 148, and a pharmaceutically acceptable carrier. 

165. A vaccine formulation comprising a therapeutically or prophylactically effective 
amount of a nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO: 15, a 
complement thereof or a fragment thereof, and a pharmaceutically acceptable carrier. 

166. A method for identifying a subject infected with the hSARS virus of claim 144, 
comprising: 

(a) obtaining total RNA from a biological sample obtained from the subject; 

(b) reverse transcribing the total RNA to obtain cDNA; and 

(c) amplifying the cDNA using a set of primers derived from a nucleotide 
sequence of the hSARS virus. 

167. The method of claim 136 or 166, wherein the set of primers are derived from the 
nucleotide sequence of SEQ ID NO: 15, or a complement thereof. 
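
Several of the claims above recite a nucleic acid molecule having "at least" a stated number of "contiguous nucleotides" of a reference sequence, "or a complement thereof". The following is a minimal illustrative sketch, in Python, of how such a condition might be checked in silico. It is not part of the claimed subject matter; the function names and the toy sequences are hypothetical, and "complement" is interpreted here as the reverse complement, which is an assumption.

```python
# Illustrative sketch only: function names and toy sequences are hypothetical.
# "Complement" is interpreted as the reverse complement (an assumption).

# Map each base to its Watson-Crick partner; RNA "u" is treated like "t",
# so the returned complement is always written with DNA letters.
_COMPLEMENT = str.maketrans("acgtu", "tgcaa")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide string (DNA letters)."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest contiguous stretch common to both strings."""
    candidate, reference = candidate.lower(), reference.lower()
    best = 0
    prev = [0] * (len(reference) + 1)
    for c in candidate:
        cur = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                # Extend the run that ended at the previous pair of positions.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def has_contiguous_match(candidate: str, reference: str, n: int) -> bool:
    """True if candidate shares >= n contiguous nucleotides with the
    reference or with the reverse complement of the reference."""
    rc = reverse_complement(reference)
    return max(longest_shared_run(candidate, reference),
               longest_shared_run(candidate, rc)) >= n
```

For example, with the toy strings `"ccatgcaa"` and `"ggatgctt"` the longest shared stretch is `atgc`, so the check succeeds for a threshold of 4 and fails for 5. The quadratic comparison is adequate for short probes; a full-length genome would normally be searched with an indexed alignment tool instead.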

t aaa tgt agt aga atc ata cct gcg cgt gcg cgc gta gag tgt ttt gat 49 
Lys Cys Ser Arg Ile Ile Pro Ala Arg Ala Arg Val Glu Cys Phe Asp 
1 5 10 15 

aaa ttc aaa gtg aat tca aca cta gaa cag tat gtt ttc tgc act gta 97 
Lys Phe Lys Val Asn Ser Thr Leu Glu Gln Tyr Val Phe Cys Thr Val 
20 25 30 

aat gca ttg cca gaa aca act gct gac att gta gtc ttt gat gaa atc 145 
Asn Ala Leu Pro Glu Thr Thr Ala Asp Ile Val Val Phe Asp Glu Ile 
35 40 45 

tct atg gct act aat tat gac ttg agt gtt gtc aat gct aga ctt cgt 193 
Ser Met Ala Thr Asn Tyr Asp Leu Ser Val Val Asn Ala Arg Leu Arg 
50 55 60 

gca aaa cac tac gtc tat att ggc gat cct gct caa tta cca gcc ccc 241 
Ala Lys His Tyr Val Tyr Ile Gly Asp Pro Ala Gln Leu Pro Ala Pro 
65 70 75 80 

cgc aca ttg ctg act aaa ggc aca cta gaa cca gaa tat ttt aat tca 289 
Arg Thr Leu Leu Thr Lys Gly Thr Leu Glu Pro Glu Tyr Phe Asn Ser 
85 90 95 

gtg tgc aga ctt atg aaa aca ata ggt cca gac atg ttc ctt gga act 337 
Val Cys Arg Leu Met Lys Thr Ile Gly Pro Asp Met Phe Leu Gly Thr 
100 105 110 

tgt cgc cgt tgt cct gct gaa att gtt gac act gtg agt gct tta gtt 385 
Cys Arg Arg Cys Pro Ala Glu Ile Val Asp Thr Val Ser Ala Leu Val 
115 120 125 

tat gac aat aag cta aaa gca cac aag gag aag tca gct caa tgc ttc 433 
Tyr Asp Asn Lys Leu Lys Ala His Lys Glu Lys Ser Ala Gln Cys Phe 
130 135 140 

aaa atg ttc tac aaa ggt gtt att aca cat gat gtt tca tct gca atc 481 
Lys Met Phe Tyr Lys Gly Val Ile Thr His Asp Val Ser Ser Ala Ile 
145 150 155 160 

aac aga cct caa ata ggc gtt gta aga gaa ttt ctt aca cgc aat cct 529 
Asn Arg Pro Gln Ile Gly Val Val Arg Glu Phe Leu Thr Arg Asn Pro 
165 170 175 

gct tgg aga aaa gct gtt ttt atc tca cct tat aat tca cag aac gct 577 
Ala Trp Arg Lys Ala Val Phe Ile Ser Pro Tyr Asn Ser Gln Asn Ala 
180 185 190 

gta gct tca aaa atc tta gga ttg cct acg cag act gtt gat tca tca 625 
Val Ala Ser Lys Ile Leu Gly Leu Pro Thr Gln Thr Val Asp Ser Ser 
195 200 205 

cag ggt tct gaa tat gac tat gtc ata ttc aca caa act act gaa aca 673 
Gln Gly Ser Glu Tyr Asp Tyr Val Ile Phe Thr Gln Thr Thr Glu Thr 
210 215 220 

gca cac tct tgt aat gtc aac cgc ttc aat gtg gct atc aca agg gca 721 
Ala His Ser Cys Asn Val Asn Arg Phe Asn Val Ala Ile Thr Arg Ala 
225 230 235 240 

aaa att ggc att ttg tgc ata atg tct gat aga gat ctt tat gac aaa 769 
Lys Ile Gly Ile Leu Cys Ile Met Ser Asp Arg Asp Leu Tyr Asp Lys 
245 250 255 

ctg caa ttt aca agt cta gaa ata cca cgt cgc aat gtg gct aca tta 817 
Leu Gln Phe Thr Ser Leu Glu Ile Pro Arg Arg Asn Val Ala Thr Leu 
260 265 270 

caa gca gaa aat gta act gga ctt ttt aag gac tgt agt aag atc att 865 
Gln Ala Glu Asn Val Thr Gly Leu Phe Lys Asp Cys Ser Lys Ile Ile 
275 280 285 

act ggt ctt cat cct aca cag gca cct aca cac ctc agc gtt gat ata 913 
Thr Gly Leu His Pro Thr Gln Ala Pro Thr His Leu Ser Val Asp Ile 
290 295 300 

aaa ttc aag act gaa gga tta tgt gtt gac ata cca ggc ata cca aag 961 
Lys Phe Lys Thr Glu Gly Leu Cys Val Asp Ile Pro Gly Ile Pro Lys 
305 310 315 320 

gac atg acc tac cgt aga ctc atc tct atg atg ggt ttc aaa atg aat 1009 
Asp Met Thr Tyr Arg Arg Leu Ile Ser Met Met Gly Phe Lys Met Asn 
325 330 335 

tac caa gtc aat ggt tac cct aat atg ttt atc acc cgc gaa gaa gct 1057 
Tyr Gln Val Asn Gly Tyr Pro Asn Met Phe Ile Thr Arg Glu Glu Ala 
340 345 350 

att cgt cac gtt cgt gcg tgg att ggc ttt gat gta gag ggc tgt cat 1105 
Ile Arg His Val Arg Ala Trp Ile Gly Phe Asp Val Glu Gly Cys His 
355 360 365 

gca act aga gat gct gtg ggt act aac cta cct ctc cag cta gga ttt 1153 
Ala Thr Arg Asp Ala Val Gly Thr Asn Leu Pro Leu Gln Leu Gly Phe 
370 375 380 

tct aca ggt gtt aac tta gta gct gta ccg act ggt tat gtt gac act 1201 
Ser Thr Gly Val Asn Leu Val Ala Val Pro Thr Gly Tyr Val Asp Thr 
385 390 395 400 

gaa aat aac cta 1213 
Glu Asn Asn Leu 
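
The listing above pairs each nucleotide triplet with the amino acid it encodes. As an illustration only, the pairing can be reproduced with the standard genetic code; the short Python sketch below is not part of the original disclosure, its names are hypothetical, and it assumes the standard (universal) codon table and input restricted to the letters a, c, g, t and u.

```python
# Illustrative sketch only: names are hypothetical; the standard genetic
# code is assumed. Input must contain only a, c, g, t/u and whitespace.

BASES = "tcag"
# One-letter amino acids for all 64 codons in t/c/a/g order; "*" marks a stop.
AMINO_1 = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
           "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
CODON_TABLE = {
    a + b + c: AMINO_1[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides: str, frame: int = 0) -> list[str]:
    """Translate codons starting at offset `frame`; stop at the first
    stop codon and ignore any incomplete trailing codon."""
    seq = "".join(nucleotides.lower().split()).replace("u", "t")
    residues = []
    for pos in range(frame, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[pos:pos + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return residues
```

Because the listing above begins with a single unpaired nucleotide, its first codons are read with `frame=1`: `translate("t aaa tgt agt aga atc ata", frame=1)` gives Lys, Cys, Ser, Arg, Ile, Ile, matching the first six residues shown.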

c aga acc atg cct aac atg ctt agg ata atg gcc tct ctt gtt ctt gct 49 
Arg Thr Met Pro Asn Met Leu Arg Ile Met Ala Ser Leu Val Leu Ala 
1 5 10 15 

cgc aaa cat aac act tgc tgt aac tta tca cac cgt ttc tac agg tta 97 
Arg Lys His Asn Thr Cys Cys Asn Leu Ser His Arg Phe Tyr Arg Leu 
20 25 30 

gct aac gag tgt gcg caa gta tta agt gag atg gtc atg tgt ggc ggc 145 
Ala Asn Glu Cys Ala Gln Val Leu Ser Glu Met Val Met Cys Gly Gly 
35 40 45 

tca cta tat gtt aaa cca ggt gga aca tca tcc ggt gat gct aca act 193 
Ser Leu Tyr Val Lys Pro Gly Gly Thr Ser Ser Gly Asp Ala Thr Thr 
50 55 60 

gct tat gct aat agt gtc ttt aac att tgt caa gct gtt aca gcc aat 241 
Ala Tyr Ala Asn Ser Val Phe Asn Ile Cys Gln Ala Val Thr Ala Asn 
65 70 75 80 

gta aat gca ctt ctt tca act gat ggt aat aag ata gct gac aag tat 289 
Val Asn Ala Leu Leu Ser Thr Asp Gly Asn Lys Ile Ala Asp Lys Tyr 
85 90 95 

gtc cgc aat cta caa cac agg ctc tat gag tgt ctc tat aga aat agg 337 
Val Arg Asn Leu Gln His Arg Leu Tyr Glu Cys Leu Tyr Arg Asn Arg 
100 105 110 

gat gtt gat cat gaa ttc gtg gat gag ttt tac gct tac ctg cgt aaa 385 
Asp Val Asp His Glu Phe Val Asp Glu Phe Tyr Ala Tyr Leu Arg Lys 
115 120 125 

cat ttc tcc atg atg att ctt tct gat gat gcc gtt gtg tgc tat aac 433 
His Phe Ser Met Met Ile Leu Ser Asp Asp Ala Val Val Cys Tyr Asn 
130 135 140 

agt aac tat gcg gct caa ggt tta gta gct agc att aag aac ttt aag 481 
Ser Asn Tyr Ala Ala Gln Gly Leu Val Ala Ser Ile Lys Asn Phe Lys 
145 150 155 160 

gca gtt ctt tat tat caa aat aat gtg ttc atg tct gag gca aaa tgt 529 
Ala Val Leu Tyr Tyr Gln Asn Asn Val Phe Met Ser Glu Ala Lys Cys 
165 170 175 

tgg act gag act gac ctt act aaa gga cct cac gaa ttt tgc tca cag 577 
Trp Thr Glu Thr Asp Leu Thr Lys Gly Pro His Glu Phe Cys Ser Gln 
180 185 190 

cat aca atg cta gtt aaa caa gga gat gat tac gtg tac ctg cct tac 625 
His Thr Met Leu Val Lys Gln Gly Asp Asp Tyr Val Tyr Leu Pro Tyr 
195 200 205 

cca gat cca tca aga ata tta ggc gca ggc tgt ttt gtc gat gat att 673 
Pro Asp Pro Ser Arg Ile Leu Gly Ala Gly Cys Phe Val Asp Asp Ile 
210 215 220 

gtc aaa cag atg gta cac tta tga ttg aaa ggt tcc gtg tca ctg gct 721 
Val Lys Gln Met Val His Leu 
225 230 

att gat gc 729 



FIG. 9 

WO 2004/085633 

PCT/CN2004/000248 

10/90 

1 atattaggtt tttacctacc caggaaaagc caaccaacct cgatctcttg tagatctgtt 
61 ctctaaacga actttaaaat ctgtgtagct gtcgctcggc tgcatgccta gtgcacctac 
121 gcagtataaa caataataaa ttttactgtc gttgacaaga aacgagtaac tcgtccctct 
181 tctgcagact gcttacggtt tcgtccgtgt tgcagtcgat catcagcata cctaggtttc 
241 gtccgggtgt gaccgaaagg taagatggag agccttgttc ttggtgtcaa cgagaaaaca 
301 cacgtccaac tcagtttgcc tgtccttcag gttagagacg tgctagtgcg tggcttcggg 
361 gactctgtgg aagaggccct atcggaggca cgtgaacacc tcaaaaatgg cacttgtggt 
421 ctagtagagc tggaaaaagg cgtactgccc cagcttgaac agccctatgt gttcattaaa 
481 cgttctgatg ccttaagcac caatcacggc cacaaggtcg ttgagctggt tgcagaaatg 
541 gacggcattc agtacggtcg tagcggtata acactgggag tactcgtgcc acatgtgggc 
601 gaaaccccaa ttgcataccg caatgttctt cttcgtaaga acggtaataa gggagccggt 
661 ggtcatagct atggcatcga tctaaagtct tatgacttag gtgacgagct tggcactgat 
721 cccattgaag attatgaaca aaactggaac actaagcatg gcagtggtgc actccgtgaa 
781 ctcactcgtg agctcaatgg aggtgcagtc actcgctatg tcgacaacaa tttctgtggc 
841 ccagatgggt accctcttga ttgcatcaaa gattttctcg cacgcgcggg caagtcaatg 
901 tgcactcttt ccgaacaact tgattacatc gagtcgaaga gaggtgtcta ctgctgccgt 
961 gaccatgagc atgaaattgc ctggttcact gagcgctctg ataagagcta cgagcaccag 
1021 acacccttcg aaattaagag tgccaagaaa tttgacactt tcaaagggga atgcccaaag 
1081 tttgtgtttc ctcttaactc aaaagtcaaa gtcattcaac cacgtgttga aaagaaaaag 
1141 actgagggtt tcatggggcg tatacgctct gtgtaccctg ttgcatctcc acaggagtgt 
1201 aacaatatgc acttgtctac cttgatgaaa tgtaatcatt gcgatgaagt ttcatggcag 
1261 acgtgcgact ttctgaaagc cacttgtgaa cattgtggca ctgaaaattt agttattgaa 
1321 ggacctacta catgtgggta cctacctact aatgctgtag tgaaaatgcc atgtcctgcc 
1381 tgtcaagacc cagagattgg acctgagcat agtgttgcag attatcacaa ccactcaaac 
1441 attgaaactc gactccgcaa gggaggtagg actagatgtt ttggaggctg tgtgtttgcc 
1501 tatgttggct gctataataa gcgtgcctac tgggttcctc gtgctagtgc tgatattggc 
1561 tcaggccata ctggcattac tggtgacaat gtggagacct tgaatgagga tctccttgag 
1621 atactgagtc gtgaacgtgt taacattaac attgttggcg attttcattt gaatgaagag 
1681 gttgccatca ttttggcatc tttctctgct tctacaagtg cctttattga cactataaag 
1741 agtcttgatt acaagtcttt caaaaccatt gttgagtcct gcggtaacta taaagttacc 
1801 aagggaaagc ccgtaaaagg tgcttggaac attggacaac agagatcagt tttaacacca 
1861 ctgtgtggtt ttccctcaca ggctgctggt gttatcagat caatttttgc gcgcacactt 
1921 gatgcagcaa accactcaat tcctgatttg caaagagcag ctgtcaccat acttgatggt 
1981 atttctgaac agtcattacg tcttgtcgac gccatggttt atacttcaga cctgctcacc 
2041 aacagtgtca ttattatggc atatgtaact ggtggtcttg tacaacagac ttctcagtgg 
2101 ttgtctaatc ttttgggcac tactgttgaa aaactcaggc ctatctttga atggattgag 
2161 gcgaaactta gtgcaggagt tgaatttctc aaggatgctt gggagattct caaatttctc 
2221 attacaggtg tttttgacat cgtcaagggt caaatacagg ttgcttcaga taacatcaag 
2281 gattgtgtaa aatgcttcat tgatgttgtt aacaaggcac tcgaaatgtg cattgatcaa 
2341 gtcactatcg ctggcgcaaa gttgcgatca ctcaacttag gtgaagtctt catcgctcaa 
2401 agcaagggac tttaccgtca gtgtatacgt ggcaaggagc agctgcaact actcatgcct 
2461 cttaaggcac caaaagaagt aacctttctt gaaggtgatt cacatgacac agtacttacc 
2521 tctgaggagg ttgttctcaa gaacggtgaa ctcgaagcac tcgagacgcc cgttgatagc 
2581 ttcacaaatg gagctatcgt cggcacacca gtctgtgtaa atggcctcat gctcttagag 
2641 attaaggaca aagaacaata ctgcgcattg tctcctggtt tactggctac aaacaatgtc 
2701 tttcgcttaa aagggggtgc accaattaaa ggtgtaacct ttggagaaga tactgtttgg 
2761 gaagttcaag gttacaagaa tgtgagaatc acatttgagc ttgatgaacg tgttgacaaa 
2821 gtgcttaatg aaaagtgctc tgtctacact gttgaatccg gtaccgaagt tactgagttt 
2881 gcatgtgttg tagcagaggc tgttgtgaag actttacaac cagtttctga tctccttacc 
2941 aacatgggta ttgatcttga tgagtggagt gtagctacat tctacttatt tgatgatgct 
3001 ggtgaagaaa acttttcatc acgtatgtat tgttcctttt accctccaga tgaggaagaa 
3061 gaggacgatg cagagtgtga ggaagaagaa attgatgaaa cctgtgaaca tgagtacggt 
3121 acagaggatg attatcaagg tctccctctg gaatttggtg cctcagctga aacagttcga 
3181 gttgaggaag aagaagagga agactggctg gatgatacta ctgagcaatc agagattgag 
3241 ccagaaccag aacctacacc tgaagaacca gttaatcagt ttactggtta tttaaaactt 
3301 actgacaatg ttgccattaa atgtgttgac atcgttaagg aggcacaaag tgctaatcct 
3361 atggtgattg taaatgctgc taacatacac ctgaaacatg gtggtggtgt agcaggtgca 
3421 ctcaacaagg caaccaatgg tgccatgcaa aaggagagtg atgattacat taagctaaat 
3481 ggccctctta cagtaggagg gtcttgtttg ctttctggac ataatcttgc taagaagtgt 
3541 ctgcatgttg ttggacctaa cctaaatgca ggtgaggaca tccagcttct taaggcagca 
3601 tatgaaaatt tcaattcaca ggacatctta cttgcaccat tgttgtcagc aggcatattt 
3661 ggtgctaaac cacttcagtc tttacaagtg tgcgtgcaga cggttcgtac acaggtttat 
3721 attgcagtca atgacaaagc tctttatgag caggttgtca tggattatct tgataacctg 
3781 aagcctagag tggaagcacc taaacaagag gagccaccaa acacagaaga ttccaaaact 
3841 gaggagaaat ctgtcgtaca gaagcctgtc gatgtgaagc caaaaattaa ggcctgcatt 
3901 gatgaggtta ccacaacact ggaagaaact aagtttctta ccaataagtt actcttgttt 
3961 gctgatatca atggtaagct ttaccatgat tctcagaaca tgcttagagg tgaagatatg 
4021 tctttccttg agaaggatgc accttacatg gtaggtgatg ttatcactag tggtgatatc 
4081 acttgtgttg taataccctc caaaaaggct ggtggcacta ctgagatgct ctcaagagct 
4141 ttgaagaaag tgccagttga tgagtatata accacgtacc ctggacaagg atgtgctggt 
4201 tatacacttg aggaagctaa gactgctctt aagaaatgca aatctgcatt ttatgtacta 
4261 ccttcagaag cacctaatgc taaggaagag attctaggaa ctgtatcctg gaatttgaga 
4321 gaaatgcttg ctcatgctga agagacaaga aaattaatgc ctatatgcat ggatgttaga 
4381 gccataatgg caaccatcca acgtaagtat aaaggaatta aaattcaaga gggcatcgtt 
4441 gactatggtg tccgattctt cttttatact agtaaagagc ctgtagcttc tattattacg 
4501 aagctgaact ctctaaatga gccgcttgtc acaatgccaa ttggttatgt gacacatggt 
4561 tttaatcttg aagaggctgc gcgctgtatg cgttctctta aagctcctgc cgtagtgtca 
4621 gtatcatcac cagatgctgt tactacatat aatggatacc tcacttcgtc atcaaagaca 
4681 tctgaggagc actttgtaga aacagtttct ttggctggct cttacagaga ttggtcctat 
4741 tcaggacagc gtacagagtt aggtgttgaa tttcttaagc gtggtgacaa aattgtgtac 
4801 cacactctgg agagccccgt cgagtttcat cttgacggtg aggttctttc acttgacaaa 
4861 ctaaagagtc tcttatccct gcgggaggtt aagactataa aagtgttcac aactgtggac 
4921 aacactaatc tccacacaca gcttgtggat atgtctatga catatggaca gcagtttggt 
4981 ccaacatact tggatggtgc tgatgttaca aaaattaaac ctcatgtaaa tcatgagggt 
5041 aagactttct ttgtactacc tagtgatgac acactacgta gtgaagcttt cgagtactac 
5101 catactcttg atgagagttt tcttggtagg tacatgtctg ctttaaacca cacaaagaaa 
5161 tggaaatttc ctcaagttgg tggtttaact tcaattaaat gggctgataa caattgttat 
5221 ttgtctagtg ttttattagc acttcaacag cttgaagtca aattcaatgc accagcactt 
5281 caagaggctt attatagagc ccgtgctggt gatgctgcta acttttgtgc actcatactc 
5341 gcttacagta ataaaactgt tggcgagctt ggtgatgtca gagaaactat gacccatctt 
5401 ctacagcatg ctaatttgga atctgcaaag cgagttctta atgtggtgtg taaacattgt 
5461 ggtcagaaaa ctactacctt aacgggtgta gaagctgtga tgtatatggg tactctatct 
5521 tatgataatc ttaagacagg tgtttccatt ccatgtgtgt gtggtcgtga tgctacacaa 
5581 tatctagtac aacaagagtc ttcttttgtt atgatgtctg caccacctgc tgagtataaa 
5641 ttacagcaag gtacattctt atgtgcgaat gagtacactg gtaactatca gtgtggtcat 
5701 tacactcata taactgctaa ggagaccctc tatcgtattg acggagctca ccttacaaag 
5761 atgtcagagt acaaaggacc agtgactgat gttttctaca aggaaacatc ttacactaca 
5821 accatcaagc ctgtgtcgta taaactcgat ggagttactt acacagagat tgaaccaaaa 
5881 ttggatgggt attataaaaa ggataatgct tactatacag agcagcctat agaccttgta 
5941 ccaactcaac cattaccaaa tgcgagtttt gataatttca aactcacatg ttctaacaca 
6001 aaatttgctg atgatttaaa tcaaatgaca ggcttcacaa agccagcttc acgagagcta 
6061 tctgtcacat tcttcccaga cttgaatggc gatgtagtgg ctattgacta tagacactat 
6121 tcagcgagtt tcaagaaagg tgctaaatta ctgcataagc caattgtttg gcacattaac 
6181 caggctacaa ccaagacaac gttcaaacca aacacttggt gtttacgttg tctttggagt 
6241 acaaagccag tagatacttc aaattcattt gaagttctgg cagtagaaga cacacaagga 
6301 atggacaatc ttgcttgtga aagtcaacaa cccacctctg aagaagtagt ggaaaatcct 
6361 accatacaga aggaagtcat agagtgtgac gtgaaaacta ccgaagttgt aggcaatgtc 
6421 atacttaaac catcagatga aggtgttaaa gtaacacaag agttaggtca tgaggatctt 
6481 atggctgctt atgtggaaaa cacaagcatt accattaaga aacctaatga gctttcacta 
6541 gccttaggtt taaaaacaat tgccactcat ggtattgctg caattaatag tgttccttgg 
6601 agtaaaattt tggcttatgt caaaccattc ttaggacaag cagcaattac aacatcaaat 
6661 tgcgctaaga gattagcaca acgtgtgttt aacaattata tgccttatgt gtttacatta 
6721 ttgttccaat tgtgtacttt tactaaaagt accaattcta gaattagagc ttcactacct 
6781 acaactattg ctaaaaatag tgttaagagt gttgctaaat tatgtttgga tgccggcatt 
6841 aattatgtga agtcacccaa attttctaaa ttgttcacaa tcgctatgtg gctattgttg 
6901 ttaagtattt gcttaggttc tctaatctgt gtaactgctg cttttggtgt actcttatct 
6961 aattttggtg ctccttctta ttgtaatggc gttagagaat tgtatcttaa ttcgtctaac 
7021 gttactacta tggatttctg tgaaggttct tttccttgca gcatttgttt aagtggatta 
7081 gactcccttg attcttatcc agctcttgaa accattcagg tgacgatttc atcgtacaag 
7141 ctagacttga caattttagg tctggccgct gagtgggttt tggcatatat gttgttcaca 
7201 aaattctttt atttattagg tctttcagct ataatgcagg tgttctttgg ctattttgct 
7261 agtcatttca tcagcaattc ttggctcatg tggtttatca ttagtattgt acaaatggca 
7321 cccgtttctg caatggttag gatgtacatc ttctttgctt ctttctacta catatggaag 
7381 agctatgttc atatcatgga tggttgcacc tcttcgactt gcatgatgtg ctataagcgc 
7441 aatcgtgcca cacgcgttga gtgtacaact attgttaatg gcatgaagag atctttctat 
7501 gtctatgcaa atggaggccg tggcttctgc aagactcaca attggaattg tctcaattgt 
7561 gacacatttt gcactggtag tacattcatt agtgatgaag ttgctcgtga tttgtcactc 
7621 cagtttaaaa gaccaatcaa ccctactgac cagtcatcgt atattgttga tagtgttgct 
7681 gtgaaaaatg gcgcgcttca cctctacttt gacaaggctg gtcaaaagac ctatgagaga 
7741 catccgctct cccattttgt caatttagac aatttgagag ctaacaacac taaaggttca 
7801 ctgcctatta atgtcatagt ttttgatggc aagtccaaat gcgacgagtc tgcttctaag 
7861 tctgcttctg tgtactacag tcagctgatg tgccaaccta ttctgttgct tgaccaagct 
7921 cttgtatcaa acgttggaga tagtactgaa gtttccgtta agatgtttga tgcttatgtc 
7981 gacacctttt cagcaacttt tagtgttcct atggaaaaac ttaaggcact tgttgctaca 
8041 gctcacagcg agttagcaaa gggtgtagct ttagatggtg tcctttctac attcgtgtca 
8101 gctgcccgac aaggtgttgt tgataccgat gttgacacaa aggatgttat tgaatgtctc 
8161 aaactttcac atcactctga cttagaagtg acaggtgaca gttgtaacaa tttcatgctc 
8221 acctataata aggttgaaaa catgacgccc agagatcttg gcgcatgtat tgactgtaat 
8281 gcaaggcata tcaatgccca agtagcaaaa agtcacaatg tttcactcat ctggaatgta 
8341 aaagactaca tgtctttatc tgaacagctg cgtaaacaaa ttcgtactgc tgccaagaag 
8401 aacaacatac cttttacact aacttgtgct acaactagac aggttgtcaa tgtcataact 
8461 actaaaatct cactcaaggg tggtaagatt gttagtactt gttttaaact tatgcttaag 
8521 gccacattat tgtgcgttct tgctgcattg gtttgttata tcgttatgcc agtacataca 
8581 ttgtcaatcc atgatggtta cacaaatgaa atcattggtt acaaagccat tcaggatggt 
8641 gtcactcgtg acatcatttc tactgatgat tgttttgcaa ataaacatgc tggttttgac 
8701 gcatggttta gccagcgtgg tggttcatac aaaaatgaca aaagctgccc tgtagtagct 
8761 gctatcatta caagagagat tggtttcata gtgcctggct taccgggtac tgtgctgaga 
8821 gcaatcaatg gtgacttctt gcattttcta cctcgtgttt ttagtgctgt tggcaacatt 
8881 tgctacacac cttccaaact cattgagtat agtgattttg ctacctctgc ttgcgttctt 
8941 gctgctgagt gtacaatttt taaggatgct atgggcaaac ctgtgccata ttgttatgac 
9001 actaatttgc tagagggttc tatttcttat agtgagcttc gtccagacac tcgttatgtg 
9061 cttatggatg gttccatcat acagtttcct aacacttacc tggagggttc tgttagagta 
9121 gtaacaactt ttgatgctga gtactgtaga catggtacat gcgaaaggtc agaagtaggt 
9181 atttgcctat ctaccagtgg tagatgggtt cttaataatg agcattacag agctctatca 
9241 ggagttttct gtggtgttga tgcgatgaat ctcatagcta acatctttac tcctcttgtg 
9301 caacctgtgg gtgctttaga tgtgtctgct tcagtagtgg ctggtggtat tattgccata 
9361 ttggtgactt gtgctgccta ctactttatg aaattcagac gtgtttttgg tgagtacaac 
9421 catgttgttg ctgctaatgc acttttgttt ttgatgtctt tcactatact ctgtctggta 
9481 ccagcttaca gctttctgcc gggagtctac tcagtctttt acttgtactt gacattctat 
9541 ttcaccaatg atgtttcatt cttggctcac cttcaatggt ttgccatgtt ttctcctatt 
9601 gtgccttttt ggataacagc aatctatgta ttctgtattt ctctgaagca ctgccattgg 
9661 ttctttaaca actatcttag gaaaagagtc atgtttaatg gagttacatt tagtaccttc 
9721 gaggaggctg ctttgtgtac ctttttgctc aacaaggaaa tgtacctaaa attgcgtagc 
9781 gagacactgt tgccacttac acagtataac aggtatcttg ctctatataa caagtacaag 
9841 tatttcagtg gagccttaga tactaccagc tatcgtgaag cagcttgctg ccacttagca 
9901 aaggctctaa atgactttag caactcaggt gctgatgttc tctaccaacc accacagaca 
9961 tcaatcactt ctgctgttct gcagagtggt tttaggaaaa tggcattccc gtcaggcaaa 
10021 gttgaagggt gcatggtaca agtaacctgt ggaactacaa ctcttaatgg attgtggttg 
aagacatgtc 
13441 caggcactag tactgatgtc gtctacaggg cttttgatat ttacaacgaa aaaagtgctg 
13501 gttttgcaaa gttcctaaaa actaattgct gtcgcttcca ggagaaggat gaggaaggca 
13561 atttattaga ctcttacttt gtagttaaga ggcatactat gtctaactac caacatgaag 
13621 agactattta taacttggtt aaagattgtc cagcggttgc tgtccatgac tttttcaagt 
13681 ttagagtaga tggtgacatg gtaccacata tatcacgtca gcgtctaact aaatacacaa 
13741 tggctgattt agtctatgct ctacgtcatt ttgatgaggg taattgtgat acattaaaag 
13801 aaatactcgt cacatacaat tgctgtgatg atgattattt caataagaag gattggtatg 
13861 acttcgtaga gaatcctgac atcttacgcg tatatgctaa cttaggtgag cgtgtacgcc 
13921 aatcattatt aaagactgta caattctgcg atgctatgcg tgatgcaggc attgtaggcg 
13981 tactgacatt agataatcag gatcttaatg ggaactggta cgatttcggt gatttcgtac 
14041 aagtagcacc aggctgcgga gttcctattg tggattcata ttactcattg ctgatgccca 
14101 tcctcacttt gactagggca ttggctgctg agtcccatat ggatgctgat ctcgcaaaac 
14161 cacttattaa gtgggatttg ctgaaatatg attttacgga agagagactt tgtctcttcg 
14221 accgttattt taaatattgg gaccagacat accatcccaa ttgtattaac tgtttggatg 
14281 ataggtgtat ccttcattgt gcaaacttta atgtgttatt ttctactgtg tttccaccta 
14341 caagttttgg accactagta agaaaaatat ttgtagatgg tgttcctttt gttgtttcaa 
14401 ctggatacca ttttcgtgag ttaggagtcg tacataatca ggatgtaaac ttacatagct 
14461 cgcgtctcag tttcaaggaa cttttagtgt atgctgctga tccagctatg catgcagctt 
14521 ctggcaattt attgctagat aaacgcacta catgcttttc agtagctgca ctaacaaaca 
14581 atgttgcttt tcaaactgtc aaacccggta attttaataa agacttttat gactttgctg 
14641 tgtctaaagg tttctttaag gaaggaagtt ctgttgaact aaaacacttc ttctttgctc 
14701 aggatggcaa cgctgctatc agtgattatg actattatcg ttataatctg ccaacaatgt 
14761 gtgatatcag acaactccta ttcgtagttg aagttgttga taaatacttt gattgttacg 
14821 atggtggctg tattaatgcc aaccaagtaa tcgttaacaa tctggataaa tcagctggtt 
14881 tcccatttaa taaatggggt aaggctagac tttattatga ctcaatgagt tatgaggatc 
14941 aagatgcact tttcgcgtat actaagcgta atgtcatccc tactataact caaatgaatc 
15001 ttaagtatgc cattagtgca aagaatagag ctcgcaccgt agctggtgtc tctatctgta 
15061 gtactatgac aaatagacag tttcatcaga aattattgaa gtcaatagcc gccactagag 
15121 gagctactgt ggtaattgga acaagcaagt tttacggtgg ctggcataat atgttaaaaa 
15181 ctgtttacag tgatgtagaa actccacacc ttatgggttg ggattatcca aaatgtgaca 
15241 gagccatgcc taacatgctt aggataatgg cctctcttgt tcttgctcgc aaacataaca 
15301 cttgctgtaa cttatcacac cgtttctaca ggttagctaa cgagtgtgcg caagtattaa 
15361 gtgagatggt catgtgtggc ggctcactat atgttaaacc aggtggaaca tcatccggtg 
15421 atgctacaac tgcttatgct aatagtgtct ttaacatttg tcaagctgtt acagccaatg 
15481 taaatgcact tctttcaact gatggtaata agatagctga caagtatgtc cgcaatctac 
15541 aacacaggct ctatgagtgt ctctatagaa atagggatgt tgatcatgaa ttcgtggatg 
15601 agttttacgc ttacctgcgt aaacatttct ccatgatgat tctttctgat gatgccgttg 
15661 tgtgctataa cagtaactat gcggctcaag gtttagtagc tagcattaag aactttaagg 
15721 cagttcttta ttatcaaaat aatgtgttca tgtctgaggc aaaatgttgg actgagactg 
15781 accttactaa aggacctcac gaattttgct cacagcatac aatgctagtt aaacaaggag 
15841 atgattacgt gtacctgcct tacccagatc catcaagaat attaggcgca ggctgttttg 
15901 tcgatgatat tgtcaaaaca gatggtacac ttatgattga aaggttcgtg tcactggcta 
15961 ttgatgctta cccacttaca aaacatccta atcaggagta tgctgatgtc tttcacttgt 
16021 atttacaata cattagaaag ttacatgatg agcttactgg ccacatgttg gacatgtatt 
16081 ccgtaatgct aactaatgat aacacctcac ggtactggga acctgagttt tatgaggcta 
16141 tgtacacacc acatacagtc ttgcaggctg taggtgcttg tgtattgtgc aattcacaga 
16201 cttcacttcg ttgcggtgcc tgtattagga gaccattcct atgttgcaag tgctgctatg 
16261 accatgtcat ttcaacatca cacaaattag tgttgtctgt taatccctat gtttgcaatg 
16321 ccccaggttg tgatgtcact gatgtgacac aactgtatct aggaggtatg agctattatt 
16381 gcaagtcaca taagcctccc attagttttc cattatgtgc taatggtcag gtttttggtt 
16441 tatacaaaaa cacatgtgta ggcagtgaca atgtcactga cttcaatgcg atagcaacat 
16501 gtgattggac taatgctggc gattacatac ttgccaacac ttgtactgag agactcaagc 
16561 ttttcgcagc agaaacgctc aaagccactg aggaaacatt taagctgtca tatggtattg 
16621 ccactgtacg cgaagtactc tctgacagag aattgcatct ttcatgggag gttggaaaac 
16681 ctagaccacc attgaacaga aactatgtct ttactggtta ccgtgtaact aaaaatagta 
16741 aagtacagat tggagagtac acctttgaaa aaggtgacta tggtgatgct gttgtgtaca 
16801 gaggtactac gacatacaag ttgaatgttg gtgattactt tgtgttgaca tctcacactg 
16861 taatgccact tagtgcacct actctagtgc cacaagagca ctatgtgaga attactggct 
16921 tgtacccaac actcaacatc tcagatgagt tttctagcaa tgttgcaaat tatcaaaagg 
16981 tcggcatgca aaagtactct acactccaag gaccacctgg tactggtaag agtcattttg 
17041 ccatcggact tgctctctat tacccatctg ctcgcatagt gtatacggca tgctctcatg 
17101 cagctgttga tgccctatgt gaaaaggcat taaaatattt gcccatagat aaatgtagta 
17161 gaatcatacc tgcgcgtgcg cgcgtagagt gttttgataa attcaaagtg aattcaacac 
17221 tagaacagta tgttttctgc actgtaaatg cattgccaga aacaactgct gacattgtag 
17281 tctttgatga aatctctatg gctactaatt atgacttgag tgttgtcaat gctagacttc 
17341 gtgcaaaaca ctacgtctat attggcgatc ctgctcaatt accagccccc cgcacattgc 
17401 tgactaaagg cacactagaa ccagaatatt ttaattcagt gtgcagactt atgaaaacaa 
17461 taggtccaga catgttcctt ggaacttgtc gccgttgtcc tgctgaaatt gttgacactg 
17521 tgagtgcttt agtttatgac aataagctaa aagcacacaa ggataagtca gctcaatgct 
17581 tcaaaatgtt ctacaaaggt gttattacac atgatgtttc atctgcaatc aacagacctc 
17641 aaataggcgt tgtaagagaa tttcttacac gcaatcctgc ttggagaaaa gctgttttta 
17701 tctcacctta taattcacag aacgctgtag cttcaaaaat cttaggattg cctacgcaga 
17761 ctgttgattc atcacagggt tctgaatatg actatgtcat attcacacaa actactgaaa 
17821 cagcacactc ttgtaatgtc aaccgcttca atgtggctat cacaagggca aaaattggca 
17881 ttttgtgcat aatgtctgat agagatcttt atgacaaact gcaatttaca agtctagaaa 
17941 taccacgtcg caatgtggct acattacaag cagaaaatgt aactggactt tttaaggact 
18001 gtagtaagat cattactggt cttcatccta cacaggcacc tacacacctc agcgttgata 
18061 taaaattcaa gactgaagga ttatgtgttg acataccagg cataccaaag gacatgacct 
18121 accgtagact catctctatg atgggtttca aaatgaatta ccaagtcaat ggttacccta 
18181 atatgtttat cacccgcgaa gaagctattc gtcacgttcg tgcgtggatt ggctttgatg 
18241 tagagggctg tcatgcaact agagatgctg tgggtactaa cctacctctc cagctaggat 
18301 tttctacagg tgttaactta gtagctgtac cgactggtta tgttgacact gaaaataaca 
18361 cagaattcac cagagttaat gcaaaacctc caccaggtga ccagtttaaa catcttatac 
18421 cactcatgta taaaggcttg ccctggaatg tagtgcgtat taagatagta caaatgctca 
18481 gtgatacact gaaaggattg tcagacagag tcgtgttcgt cctttgggcg catggctttg 
18541 agcttacatc aatgaagtac tttgtcaaga ttggacctga aagaacgtgt tgtctgtgtg 
18601 acaaacgtgc aacttgcttt tctacttcat cagatactta tgcctgctgg aatcattctg 
18661 tgggttttga ctatgtctat aacccattta tgattgatgt tcagcagtgg ggctttacgg 
18721 gtaaccttca gagtaaccat gaccaacatt gccaggtaca tggaaatgca catgtggcta 
18781 gttgtgatgc tatcatgact agatgtttag cagtccatga gtgctttgtt aagcgcgttg 
18841 attggtctgt tgaataccct attataggag atgaactgag ggttaattct gcttgcagaa 
18901 aagtacaaca catggttgtg aagtctgcat tgcttgctga taagtttcca gttcttcatg 
18961 acattggaaa tccaaaggct atcaagtgtg tgcctcaggc tgaagtagaa tggaagttct 
19021 acgatgctca gccatgtagt gacaaagctt acaaaataga ggaactcttc tattcttatg 
19081 ctacacatca cgataaattc actgatggtg tttgtttgtt ttggaattgt aacgttgatc 
19141 gttacccagc caatgcaatt gtgtgtaggt ttgacacaag agtcttgtca aacttgaact 
19201 taccaggctg tgatggtggt agtttgtatg tgaataagca tgcattccac actccagctt 
19261 tcgataaaag tgcatttact aatttaaagc aattgccttt cttttactat tctgatagtc 
19321 cttgtgagtc tcatggcaaa caagtagtgt cggatattga ttatgttcca ctcaaatctg 
19381 ctacgtgtat tacacgatgc aatttaggtg gtgctgtttg cagacaccat gcaaatgagt 
19441 accgacagta cttggatgca tataatatga tgatttctgc tggatttagc ctatggattt 
19501 acaaacaatt tgatacttat aacctgtgga atacatttac caggttacag agtttagaaa 
19561 atgtggctta taatgttgtt aataaaggac actttgatgg acacgccggc gaagcacctg 
19621 tttccatcat taataatgct gtttacacaa aggtagatgg tattgatgtg gagatctttg 
19681 aaaataagac aacacttcct gttaatgttg catttgagct ttgggctaag cgtaacatta 
19741 aaccagtgcc agagattaag atactcaata atttgggtgt tgatatcgct gctaatactg 
19801 taatctggga ctacaaaaga gaagccccag cacatgtatc tacaataggt gtctgcacaa 
19861 tgactgacat tgccaagaaa cctactgaga gtgcttgttc ttcacttact gtcttgtttg 
19921 atggtagagt ggaaggacag gtagaccttt ttagaaacgc ccgtaatggt gttttaataa 
19981 cagaaggttc agtcaaaggt ctaacacctt caaagggacc agcacaagct agcgtcaatg 
20041 gagtcacatt aattggagaa tcagtaaaaa cacagtttaa ctactttaag aaagtagacg 
20101 gcattattca acagttgcct gaaacctact ttactcagag cagagactta gaggatttta 
20161 agcccagatc acaaatggaa actgactttc tcgagctcgc tatggatgaa ttcatacagc 
20221 gatataagct cgagggctat gccttcgaac acatcgttta tggagatttc agtcatggac 
20281 aacttggcgg tcttcattta atgataggct tagccaagcg ctcacaagat tcaccactta 
20341 aattagagga ttttatccct atggacagca cagtgaaaaa ttacttcata acagatgcgc 
20401 aaacaggttc atcaaaatgt gtgtgttctg tgattgatct tttacttgat gactttgtcg 
20461 agataataaa gtcacaagat ttgtcagtga tttcaaaagt ggtcaaggtt acaattgact 
20521 atgctgaaat ttcattcatg ctttggtgta aggatggaca tgttgaaacc ttctacccaa 
20581 aactacaagc aagtcaagcg tggcaaccag gtgttgcgat gcctaacttg tacaagatgc 
20641 aaagaatgct tcttgaaaag tgtgaccttc agaattatgg tgaaaatgct gttataccaa 
20701 aaggaataat gatgaatgtc gcaaagtata ctcaactgtg tcaatactta aatacactta 
20761 ctttagctgt accctacaac atgagagtta ttcactttgg tgctggctct gataaaggag 
20821 ttgcaccagg tacagctgtg ctcagacaat ggttgccaac tggcacacta cttgtcgatt 
20881 cagatcttaa tgacttcgtc tccgacgcag attctacttt aattggagac tgtgcaacag 
20941 tacatacggc taataaatgg gaccttatta ttagcgatat gtatgaccct aggaccaaac 
21001 atgtgacaaa agagaatgac tctaaagaag ggtttttcac ttatctgtgt ggatttataa 
21061 agcaaaaact agccctgggt ggttctatag ctgtaaagat aacagagcat tcttggaatg 
21121 ctgaccttta caagcttatg ggccatttct catggtggac agcttttgtt acaaatgtaa 
21181 atgcatcatc atcggaagca tttttaattg gggctaacta tcttggcaag ccgaaggaac 
21241 aaattgatgg ctataccatg catgctaact acattttctg gaggaacaca aatcctatcc 
21301 agttgtcttc ctattcactc tttgacatga gcaaatttcc tcttaaatta agaggaactg 
21361 ctgtaatgtc tcttaaggag aatcaaatca atgatatgat ttattctctt ctggaaaaag 
21421 gtaggcttat cattagagaa aacaacagag ttgtggtttc aagtgatatt cttgttaaca 
21481 actaaacgaa catgtttatt ttcttattat ttcttactct cactagtggt agtgaccttg 
21541 accggtgcac cacttttgat gatgttcaag ctcctaatta cactcaacat acttcatcta 
21601 tgaggggggt ttactatcct gatgaaattt ttagatcaga cactctttat ttaactcagg 
21661 atttatttct tccattttat tctaatgtta cagggtttca tactattaat catacgtttg 
21721 gcaaccctgt catacctttt aaggatggta tttattttgc tgccacagag aaatcaaatg 
21781 ttgtccgtgg ttgggttttt ggttctacca tgaacaacaa gtcacagtcg gtgattatta 
21841 ttaacaattc tactaatgtt gttatacgag catgtaactt tgaattgtgt gacaaccctt 
21901 tctttgctgt ttctaaaccc atgggtacac agacacatac tatgatattc gataatgcat 
21961 ttaattgcac tttcgagtac atatctgatg ccttttcgct tgatgtttca gaaaagtcag 
22021 gtaattttaa acacttacga gagtttgtgt ttaaaaataa agatgggttt ctctatgttt 
22081 ataagggcta tcaacctata gatgtagttc gtgatctacc ttctggtttt aacactttga 
22141 aacctatttt taagttgcct cttggtatta acattacaaa ttttagagcc attcttacag 
22201 ccttttcacc tgctcaagac atttggggca cgtcagctgc agcctatttt gttggctatt 
22261 taaagccaac tacatttatg ctcaagtatg atgaaaatgg tacaatcaca gatgctgttg 
22321 attgttctca aaatccactt gctgaactca aatgctctgt taagagcttt gagattgaca 
22381 aaggaattta ccagacctct aatttcaggg ttgttccctc aggagatgtt gtgagattcc 
22441 ctaatattac aaacttgtgt ccttttggag aggtttttaa tgctactaaa ttcccttctg 
22501 tctatgcatg ggagagaaaa aaaatttcta attgtgttgc tgattactct gtgctctaca 
22561 actcaacatt tttttcaacc tttaagtgct atggcgtttc tgccactaag ttgaatgatc 
22621 tttgcttctc caatgtctat gcagattctt ttgtagtcaa gggagatgat gtaagacaaa 
22681 tagcgccagg acaaactggt gttattgctg attataatta taaattgcca gatgatttca 
22741 tgggttgtgt ccttgcttgg aatactagga acattgatgc tacttcaact ggtaattata 
22801 attataaata taggtatctt agacatggca agcttaggcc ctttgagaga gacatatcta 
22861 atgtgccttt ctcccctgat ggcaaacctt gcaccccacc tgctcttaat tgttattggc 
22921 cattaaatga ttatggtttt tacaccacta ctggcattgg ctaccaacct tacagagttg 
22981 tagtactttc ttttgaactt ttaaatgcac cggccacggt ttgtggacca aaattatcca 
23041 ctgaccttat taagaaccag tgtgtcaatt ttaattttaa tggactcact ggtactggtg 
23101 tgttaactcc ttcttcaaag agatttcaac catttcaaca atttggccgt gatgtttctg 
23161 atttcactga ttccgttcga gatcctaaaa catctgaaat attagacatt tcaccttgct 
23221 cttttggggg tgtaagtgta attacacctg gaacaaatgc ttcatctgaa gttgctgttc 
23281 tatatcaaga tgttaactgc actgatgttt ctacagcaat tcatgcagat caactcacac 
23341 cagcttggcg catatattct actggaaaca atgtattcca gactcaagca ggctgtctta 
23401 taggagctga gcatgtcgac acttcttatg agtgcgacat tcctattgga gctggcattt 
23461 gtgctagtta ccatacagtt tctttattac gtagtactag ccaaaaatct attgtggctt 
23521 atactatgtc tttaggtgct gatagttcaa ttgcttactc taataacacc attgctatac 
23581 ctactaactt ttcaattagc attactacag aagtaatgcc tgtttctatg gctaaaacct 
23641 ccgtagattg taatatgtac atctgcggag attctactga atgtgctaat ttgcttctcc 
23701 aatatggtag cttttgcaca caactaaatc gtgcactctc aggtattgct gctgaacagg 
23761 atcgcaacac acgtgaagtg ttcgctcaag tcaaacaaat gtacaaaacc ccaactttga 
23821 aatattttgg tggttttaat ttttcacaaa tattacctga ccctctaaag ccaactaaga 
23881 ggtcttttat tgaggacttg ctctttaata aggtgacact cgctgatgct ggcttcatga 
23941 agcaatatgg cgaatgccta ggtgatatta atgctagaga tctcatttgt gcgcagaagt 
24001 tcaatggact tacagtgttg ccacctctgc tcactgatga tatgattgct gcctacactg 
24061 ctgctctagt tagtggtact gccactgctg gatggacatt tggtgctggc gctgctcttc 
24121 aaataccttt tgctatgcaa atggcatata ggttcaatgg cattggagtt acccaaaatg 
24181 ttctctatga gaaccaaaaa caaatcgcca accaatttaa caaggcgatt agtcaaattc 
24241 aagaatcact tacaacaaca tcaactgcat tgggcaagct gcaagacgtt gttaaccaga 
24301 atgctcaagc attaaacaca cttgttaaac aacttagctc taattttggt gcaatttcaa 
24361 gtgtgctaaa tgatatcctt tcgcgacttg ataaagtcga ggcggaggta caaattgaca 
24421 ggttaattac aggcagactt caaagccttc aaacctatgt aacacaacaa ctaatcaggg 
24481 ctgctgaaat cagggcttct gctaatcttg ctgctactaa aatgtctgag tgtgttcttg 
24541 gacaatcaaa aagagttgac ttttgtggaa agggctacca ccttatgtcc ttcccacaag 
24601 cagccccgca tggtgttgtc ttcctacatg tcacgtatgt gccatcccag gagaggaact 
24661 tcaccacagc gccagcaatt tgtcatgaag gcaaagcata cttccctcgt gaaggtgttt 
24721 ttgtgtttaa tggcacttct tggtttatta cacagaggaa cttcttttct ccacaaataa 
24781 ttactacaga caatacattt gtctcaggaa attgtgatgt cgttattggc atcattaaca 
24841 acacagttta tgatcctctg caacctgagc ttgactcatt caaagaagag ctggacaagt 
24901 acttcaaaaa tcatacatca ccagatgttg atcttggcga catttcaggc attaacgctt 
24961 ctgtcgtcaa cattcaaaaa gaaattgacc gcctcaatga ggtcgctaaa aatttaaatg 
25021 aatcactcat tgaccttcaa gaattgggaa aatatgagca atatattaaa tggccttggt 
25081 atgtttggct cggcttcatt gctggactaa ttgccatcgt catggttaca atcttgcttt 
25141 gttgcatgac tagttgttgc agttgcctca agggtgcatg ctcttgtggt tcttgctgca 
25201 agtttgatga ggatgactct gagccagttc tcaagggtgt caaattacat tacacataaa 
25261 cgaacttatg gatttgttta tgagattttt tactcttgga tcaattactg cacagccagt 
25321 aaaaattgac aatgcttctc ctgcaagtac tgttcatgct acagcaacga taccgctaca 
25381 agcctcactc cctttcggat ggcttgttat tggcgttgca tttcttgctg tttttcagag 
25441 cgctaccaaa ataattgcgc tcaataaaag atggcagcta gccctttata agggcttcca 
25501 gttcatttgc aatttactgc tgctatttgt taccatctat tcacatcttt tgcttgtcgc 
25561 tgcaggtaag gaggcgcaat ttttgtacct ctatgccttg atatattttc tacaatgcat 
25621 caacgcatgt agaattatta tgagatgttg gctttgttgg aagtgcaaat ccaagaaccc 
25681 attactttat gatgccaact actttgtttg ctggcacaca cataactatg actactgtat 
25741 accatataac agtgtcacag atacaattgt cgttactgaa ggtgacggca tttcaacacc 
25801 aaaactcaaa gaagactacc aaattggtgg ttattctgag gataggcact caggtgttaa 
25861 agactatgtc gttgtacatg gctatttcac cgaagtttac taccagcttg agtctacaca 
25921 aattactaca gacactggta ttgaaaatgc tacattcttc atctttaaca agcttgttaa 
25981 agacccaccg aatgtgcaaa tacacacaat cgacggctct tcaggagttg ctaatccagc 
26041 aatggatcca atttatgatg agccgacgac gactactagc gtgcctttgt aagcacaaga 
26101 aagtgagtac gaacttatgt actcattcgt ttcggaagaa acaggtacgt taatagttaa 
26161 tagcgtactt ctttttcttg ctttcgtggt attcttgcta gtcacactag ccatccttac 
26221 tgcgcttcga ttgtgtgcgt actgctgcaa tattgttaac gtgagtttag taaaaccaac 
26281 ggtttacgtc tactcgcgtg ttaaaaatct gaactcttct gaaggagttc ctgatcttct 
26341 ggtctaaacg aactaactat tattattatt ctgtttggaa ctttaacatt gcttatcatg 
26401 gcagacaacg gtactattac cgttgaggag cttaaacaac tcctggaaca atggaaccta 
26461 gtaataggtt tcctattcct agcctggatt atgttactac aatttgccta ttctaatcgg 
26521 aacaggtttt tgtacataat aaagcttgtt ttcctctggc tcttgtggcc agtaacactt 
26581 gcttgttttg tgcttgctgt tgtctacaga attaattggg tgactggcgg gattgcgatt 
26641 gcaatggctt gtattgtagg cttgatgtgg cttagctact tcgttgcttc cttcaggctg 
26701 tttgctcgta cccgctcaat gtggtcattc aacccagaaa caaacattct tctcaatgtg 
26761 cctctccggg ggacaattgt gaccagaccg ctcatggaaa gtgaacttgt cattggtgct 
26821 gtgatcattc gtggtcactt gcgaatggcc ggacactccc tagggcgctg tgacattaag 
26881 gacctgccaa aagagatcac tgtggctaca tcacgaacgc tttcttatta caaattagga 
26941 gcgtcgcagc gtgtaggcac tgattcaggt tttgctgcat acaaccgcta ccgtattgga 
27001 aactataaat taaatacaga ccacgccggt agcaacgaca atattgcttt gctagtacag 
27061 taagtgacaa cagatgtttc atcttgttga cttccaggtt acaatagcag agatattgat 
27121 tatcattatg aggactttca ggattgctat ttggaatctt gacgttataa taagttcaat 
27181 agtgagacaa ttatttaagc ctctaactaa gaagaattat tcggagttag atgatgaaga 
27241 acctatggag ttagattatc cataaaacga acatgaaaat tattctcttc ctgacattga 
27301 ttgtatttac atcttgcgag ctatatcact atcaggagtg tgttagaggt acgactgtac 
27361 tactaaaaga accttgccca tcaggaacat acgagggcaa ttcaccattt caccctcttg 
27421 ctgacaataa atttgcacta acttgcacta gcacacactt tgcttttgct tgtgctgacg 
27481 gtactcgaca tacctatcag ctgcgtgcaa gatcagtttc accaaaactt ttcatcagac 
27541 aagaggaggt tcaacaagag ctctactcgc cactttttct cattgttgct gctctagtat 
27601 ttttaatact ttgcttcacc attaagagaa agacagaatg aatgagctca ctttaattga 
27661 cttctatttg tgctttttag cctttctgct attccttgtt ttaataatgc ttattatatt 
27721 ttggttttca ctcgaaatcc aggatctaga agaaccttgt accaaagtct aaacgaacat 
27781 gaaacttctc attgttttga cttgtatttc tctatgcagt tgcatatgca ctgtagtaca 
27841 gcgctgtgca tctaataaac ctcatgtgct tgaagatcct tgtaaggtac aacactaggg 
27901 gtaatactta tagcactgct tggctttgtg ctctaggaaa ggttttacct tttcatagat 
27961 ggcacactat ggttcaaaca tgcacaccta atgttactat caactgtcaa gatccagctg 
28021 gtggtgcgct tatagctagg tgttggtacc ttcatgaagg tcaccaaact gctgcattta 
28081 gagacgtact tgttgtttta aataaacgaa caaattaaaa tgtctgataa tggaccccaa 
28141 tcaaaccaac gtagtgcccc ccgcattaca tttggtggac ccacagattc aactgacaat 
28201 aaccagaatg gaggacgcaa tggggcaagg ccaaaacagc gccgacccca aggtttaccc 
28261 aataatactg cgtcttggtt cacagctctc actcagcatg gcaaggagga acttagattc 
28321 cctcgaggcc agggcgttcc aatcaacacc aatagtggtc cagatgacca aattggctac 
28381 taccgaagag ctacccgacg agttcgtggt ggtgacggca aaatgaaaga gctcagcccc 
28441 agatggtact tctattacct aggaactggc ccagaagctt cacttcccta cggcgctaac 
28501 aaagaaggca tcgtatgggt tgcaactgag ggagccttga atacacccaa agaccacatt 
28561 ggcacccgca atcctaataa caatgctgcc accgtgctac aacttcctca aggaacaaca 
28621 ttgccaaaag gcttctacgc agagggaagc agaggcggca gtcaagcctc ttctcgctcc 
28681 tcatcacgta gtcgcggtaa ttcaagaaat tcaactcctg gcagcagtag gggaaattct 
28741 cctgctcgaa tggctagcgg aggtggtgaa actgccctcg cgctattgct gctagacaga 
28801 ttgaaccagc ttgagagcaa agtttctggt aaaggccaac aacaacaagg ccaaactgtc 
28861 actaagaaat ctgctgctga ggcatctaaa aagcctcgcc aaaaacgtac tgccacaaaa 
28921 cagtacaacg tcactcaagc atttgggaga cgtggtccag aacaaaccca aggaaatttc 
28981 ggggaccaag acctaatcag acaaggaact gattacaaac attggccgca aattgcacaa 
29041 tttgctccaa gtgcctctgc attctttgga atgtcacgca ttggcatgga agtcacacct 
29101 tcgggaacat ggctgactta tcatggagcc attaaattgg atgacaaaga tccacaattc 
29161 aaagacaacg tcatactgct gaacaagcac attgacgcat acaaaacatt cccaccaaca 
29221 gagcctaaaa aggacaaaaa gaaaaagact gatgaagctc agcctttgcc gcagagacaa 
29281 aagaagcagc ccactgtgac tcttcttcct gcggctgaca tggatgattt ctccagacaa 
29341 cttcaaaatt ccatgagtgg agcttctgct gattcaactc aggcataaac actcatgatg 
29401 accacacaag gcagatgggc tatgtaaacg ttttcgcaat tccgtttacg atacatagtc 
29461 tactcttgtg cagaatgaat tctcgtaact aaacagcaca agtaggttta gttaacttta 
29521 atctcacata gcaatcttta atcaatgtgt aacattaggg aggacttgaa agagccacca 
29581 cattttcatc gaggccacgc ggagtacgat cgagggtaca gtgaataatg ctagggagag 
29641 ctgcctatat ggaagagccc taatgtgtaa aattaatttt agtagtgcta tccccatgtg 
29701 attttaatag cttcttagga gaatgacaaa aaaaaaaaaa aa 
1 - ATATTAGGTTTTTACCTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTT - 60
-ILGFYLPRKSQPTSISCRSV
-Y*VFTYPGKANQPRSLVDLF
IRFLPTQEKPTNLDLL*ICS
61 - CTCTAAACGAACTTTAAAATCTGTGTAGCTGTCGCTCGGCTGCATGCCTAGTGCACCTAC - 120
-L*TNFKICVAVARLHA*CTY
-SKRTLKSV*LSLGCMPSAPT
LNEL*NLCSCRSAACLVHLR
121 - GCAGTATAAACAATAATAAATTTTACTGTCGTTGACAAGAAACGAGTAACTCGTCCCTCT - 180
-AV*TIINFTVVDKKRVTRPS
-QYKQ**ILLSLTRNE*LVPL
SINNNKFYCR*QETSNSSLF
181 - TCTGCAGACTGCTTACGGTTTCGTCCGTGTTGCAGTCGATCATCAGCATACCTAGGTTTC - 240
-SADCLRFRPCCSRSSAYLGF
-LQTAYGFVRVAVDHQHT*VS
CRLLTVSSVLQSIISIPRFR
241 - GTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTTCTTGGTGTCAACGAGAAAACA - 300 

- V R V * PKGKMESLVLGVNEKT 
-SGCDRKVRWRALFLVSTRKH 

PGVTER* DGEPCSWCQRENT 
301 - CACGTCCAACTCAGTTTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCGTGGCTTCGGG - 360 
-HVQLSLPVLQVRDVLVRGFG
-TSNSVCLSFRLETC*CVASG 
RPTQFACPSG*RRASAWLRG 
361 - GACTCTGTGGAAGAGGCCCTATCGGAGGCACGTGAACACCTCAAAAATGGCACTTGTGGT - 420 
-DSVEEALSEAREHLKNGTCG 
-TLWKRPYRRHVNTSKMALVV 
LCGRGPIGGT* TPQKWHLWS 
421 - CTAGTAGAGCTGGAAAAAGGCGTACTGCCCCAGCTTGAACAGCCCTATGTGTTCATTAAA - 480 
-LVELEKGVLPQLEQPYVFIK 

- * *SWKKAYCPSLNSPMCSLN 

SRAGKRRTAPA*TALCVH*T 
481 - CGTTCTGATGCCTTAAGCACCAATCACGGCCACAAGGTCGTTGAGCTGGTTGCAGAAATG - 540 

- R S DALSTNHGHKVVELVAEM 
-VLMP*APITATRSLSWLQKW 

F*CLKHQSRPQGR*AGCRNG 
541 - GACGGCATTCAGTACGGTCGTAGCGGTATAACACTGGGAGTACTCGTGCCACATGTGGGC - 600 

- D G I QYGRSGITLGVLVPHVG 
-TAFSTVVAV*HWEYSCHMWA 

RHSVRS*RYNTGSTRATCGR 
601 - GAAACCCCAATTGCATACCGCAATGTTCTTCTTCGTAAGAACGGTAATAAGGGAGCCGGT - 660 
-ETP IAYRNVLLRKNGNKGAG 
-KPQLHTAMFFFVRTVIREPV 
NPNCIPQCSSS*ER**GSRW 
661 - GGTCATAGCTATGGCATCGATCTAAAGTCTTATGACTTAGGTGACGAGCTTGGCACTGAT - 720
-GHSYGIDLKSYDLGDELGTD 
-VIAMASI*SLMT*VTSL ALI 
S*LWHRSKVL*LR*RAWH*S 
721 - CCCATTGAAGATTATGAACAAAACTGGAACACTAAGCATGGCAGTGGTGCACTCCGTGAA - 780 

- P IE DYEQNWNTKHGSGALRE 

- PLKIMNKTGTLSMAVVHSVN 

H *RL*TKLEH*AWQWCTP*T 
781 - CTCACTCGTGAGCTCAATGGAGGTGCAGTCACTCGCTATGTCGACAACAATTTCTGTGGC - 840
-LTRELNGGAVTRYVDNNFCG 
-SLVSSMEVQSLAMSTTISVA 

HS*AQWRCSHSLCRQQFLWP 



FIG. 11 
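
The layout of FIG. 11 (a 60-nucleotide row followed by three lines of one-letter amino acid codes, with `*` marking a stop codon) appears to be a translation of each row in the three forward reading frames. The following is a minimal sketch, assuming the standard genetic code, of how such lines can be recomputed to check a row against its translations. The names `CODON_TABLE`, `translate` and `three_frames` are illustrative and do not come from the application. A codon that straddles the end of a row is not translated here, so frames 2 and 3 return 19 residues per row rather than the 20 printed in the figure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate seq starting at offset frame (0, 1 or 2).

    Codons containing anything other than A, C, G or T are reported as 'X'.
    """
    seq = seq.upper().replace("U", "T")
    residues = []
    for i in range(frame, len(seq) - 2, 3):
        residues.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(residues)


def three_frames(seq):
    """Return the translations of seq in the three forward reading frames."""
    return [translate(seq, f) for f in range(3)]


# First row of FIG. 11 (nucleotides 1 to 60).
row = "ATATTAGGTTTTTACCTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTT"
for number, protein in enumerate(three_frames(row), start=1):
    print(number, protein)
# 1 ILGFYLPRKSQPTSISCRSV
# 2 Y*VFTYPGKANQPRSLVDL
# 3 IRFLPTQEKPTNLDLL*IC
```

Comparing the recomputed residues with the printed ones is one way to spot a misread base in a scanned copy: a single wrong nucleotide changes the codon in all three frames at that position.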




841 - CCAGATGGGTACCCTCTTGATTGCATCAAAGATTTTCTCGCACGCGCGGGCAAGTCAATG - 900
-PDGYPLDCIKDFLARAGKSM 
-QMGTLLIASKIFSHARASQC 
RWVPS *LHQRF SRTRGQVNV 
901 - TGCACTCTTTCCGAACAACTTGATTACATCGAGTCGAAGAGAGGTGTCTACTGCTGCCGT - 960 
-CTLSEQLDYIESKRGVYCCR 
-ALFPNNLITSSRREVSTAAV 
HSFRTT*LHRVEERCLLLP* 
961 - GACCATGAGCATGAAATTGCCTGGTTCACTGAGCGCTCTGATAAGAGCTACGAGCACCAG - 1020 
-DHEHE IAWFTERSDKSYEHQ 
-TMSMKLPGSLSALIRATSTR 
P*A*NCLVH*AL**ELRAPD 
1021 - ACACCCTTCGAAATTAAGAGTGCCAAGAAATTTGACACTTTCAAAGGGGAATGCCCAAAG - 1080 
-TPFEIKSAKKFDTFKGECPK
-HPSKLRVPRNLTLSKGNAQS 
TLRN* ECQEI*HFQRGMPKV 
1081 - TTTGTGTTTCCTCTTAACTCAAAAGTCAAAGTCATTCAACCACGTGTTGAAAAGAAAAAG - 1140
- F V F PLNSKVKVIQPRVE KKK 
-LCFLLTQKSKSFNHVLKRKR 
CVSS*LKSQSHSTTC*KEKD 
1141 - ACTGAGGGTTTCATGGGGCGTATACGCTCTGTGTACCCTGTTGCATCTCCACAGGAGTGT - 1200 
-TEGFMGRIRSVYPVAS PQEC 
-LRVSWGVYALCTLLHLHRSV 
*GFHGAYTLCVPCCISTGV* 
1201 - AACAATATGCACTTGTCTACCTTGATGAAATGTAATCATTGCGATGAAGTTTCATGGCAG - 1260
-NNMHLSTLMKCNHCDEVSWQ 
-TICTCLP**NVIIAMKFHGR 
QYALVYLDEM*SLR* S FMAD 
1261 - ACGTGCGACTTTCTGAAAGCCACTTGTGAACATTGTGGCACTGAAAATTTAGTTATTGAA - 1320 
-TCDFLKATCEHCGTENLVIE 
-RATF*KPLVNIVALKI*LLK 
VRLSESHL*TLWH*KFSY*R
1321 - GGACCTACTACATGTGGGTACCTACCTACTAATGCTGTAGTGAAAATGCCATGTCCTGCC - 1380 
-GPTTCGYLPTNAVVKM P CPA 
-DLLHVGTYLLML**KCHVLP
TYYMWVPTY*CCSENAMSCL 
1381 - TGTCAAGACCCAGAGATTGGACCTGAGCATAGTGTTGCAGATTATCACAACCACTCAAAC - 1440
-CQDPEIGPEHSVADYHNHSN
-VKTQRLDLSIVLQIITTTQT
SRPRDWT*A*CCRLSQPLKH 
1441 - ATTGAAACTCGACTCCGCAAGGGAGGTAGGACTAGATGTTTTGGAGGCTGTGTGTTTGCC - 1500
-IETRLRKGGRTRCFGGCVFA 
-LKLDSAREVGLDVLEAVCLP 
*NSTPQGR* D*MFWRLCVCL 
1501 - TATGTTGGCTGCTATAATAAGCGTGCCTACTGGGTTCCTCGTGCTAGTGCTGATATTGGC - 1560 
-YVGCYNKRAYWVPRASA DIG 
-MLAAI I SVPTGFLVLVLILA 
CWLL**ACLLGSSC*C*YWL 
1561 - TCAGGCCATACTGGCATTACTGGTGACAATGTGGAGACCTTGAATGAGGATCTCCTTGAG - 1620
-SGHTGITGDNVETLNEDLLE
-QAILALLVTMWRP*MRISLR
RPYWHYW*QCGDLE*GSP*D 
1621 - ATACTGAGTCGTGAACGTGTTAACATTAACATTGTTGGCGATTTTCATTTGAATGAAGAG - 1680 
-ILSRERVNINIVGDFHLNEE 
-Y*VVNVLTLTLLAIFI * M K R 
TES *TC*H*HCWRFSFE*RG 



FIG. 11 Con't 




1681 - GTTGCCATCATTTTGGCATCTTTCTCTGCTTCTACAAGTGCCTTTATTGACACTATAAAG - 1740 

- V A IILAS FSASTSAFI DT IK 
-LPSFWHLSLLLQVPLLTL*R 

CHHFGIFLCFYKCLY*HYKE
1741 - AGTCTTGATTACAAGTCTTTCAAAACCATTGTTGAGTCCTGCGGTAACTATAAAGTTACC - 1800 
-SLDYKSFKTIVESCGNYKVT 
-VLITSLSKPLLSPAVTIKLP 
S * LQVFQNHC*VLR*L* SYQ 
1801 - AAGGGAAAGCCCGTAAAAGGTGCTTGGAACATTGGACAACAGAGATCAGTTTTAACACCA - 1860
-KGKPVKGAWNIGQQRSVLTP 
-RESP*KVLGTLDNRDQF*HH 
GKARKRCLEHWTTEISFNTT 
1861 - CTGTGTGGTTTTCCCTCACAGGCTGCTGGTGTTATCAGATCAATTTTTGCGCGCACACTT - 1920 

- L C GFP SQAAGVI RS I FARTL 
-CVVFPHRLLVLSDQFLRAHL 

VWFSLTGCWCYQINFCAHT* 
1921 - GATGCAGCAAACCACTCAATTCCTGATTTGCAAAGAGCAGCTGTCACCATACTTGATGGT - 1980 
-DAANHS I PDLQRAAVT I LDG 
-MQQTTQFLICKEQLSPYLMV 
CSKPLNS*FAKSSCHHT*WY 
1981 - ATTTCTGAACAGTCATTACGTCTTGTCGACGCCATGGTTTATACTTCAGACCTGCTCACC - 2040
-IS EQSLRLVDAMVYTS DLLT 
-FLNSHYVLSTPWFILQTCSP 
F * T V I TSCRRHGLYFRPAHQ 
2041 - AACAGTGTCATTATTATGGCATATGTAACTGGTGGTCTTGTACAACAGACTTCTCAGTGG - 2100 

- N S V I IMAYVTGGLVQQT SQW 
-TVSLLWHM*LVVLYNRLLSG 

QCHYYGICNWWSCTTDFSVV 
2101 - TTGTCTAATCTTTTGGGCACTACTGTTGAAAAACTCAGGCCTATCTTTGAATGGATTGAG - 2160 
-LSNLLGTTVEKLRPIFEWIE 
-CLIFWALLLKNSGLSLNGLR 
V*SFGHYC*KTQAYL*MD*G 
2161 - GCGAAACTTAGTGCAGGAGTTGAATTTCTCAAGGATGCTTGGGAGATTCTCAAATTTCTC - 2220
-AKLSAGVEFLKDAWEI LKFL 
-RNLVQELNFSRMLGRFSNFS 
ET*CRS*ISQGCLGDSQISH 
2221 - ATTACAGGTGTTTTTGACATCGTCAAGGGTCAAATACAGGTTGCTTCAGATAACATCAAG - 2280 
-I TGVFDIVKGQI QVAS DNIK 
-LQVFLTSSRVKYRLLQITSR 
YRCF*HRQGSNTGCFR*HQG 
2281 - GATTGTGTAAAATGCTTCATTGATGTTGTTAACAAGGCACTCGAAATGTGCATTGATCAA - 2340 
-DCVKCFI DVVNKALEMC I DQ 
-IV*NASLMLLTRHSKCALIK 
LCKMLH*CC*QGTRNVH*SS 
2341 - GTCACTATCGCTGGCGCAAAGTTGCGATCACTCAACTTAGGTGAAGTCTTCATCGCTCAA - 2400
-VTIAGAKLRSLNLGEVFIAQ 
-SLSLAQSCDHST*VKSSSLK 
HYRWRKVAITQLR*SLHRSK 
2401 - AGCAAGGGACTTTACCGTCAGTGTATACGTGGCAAGGAGCAGCTGCAACTACTCATGCCT - 2460
-SKGLYRQC I RGKEQLQL LMP 
-ARDFTVSVYVARSSCNYSCL 
QGTLPSVYTWQGAAATTHAS 
2461 - CTTAAGGCACCAAAAGAAGTAACCTTTCTTGAAGGTGATTCACATGACACAGTACTTACC - 2520
-LKAPKEVTFLEGDSHDTVLT
-LRHQKK*PFLKVIHMTQYLP
*GTKRSNLS*R*FT*HSTYL
2521 - TCTGAGGAGGTTGTTCTCAAGAACGGTGAACTCGAAGCACTCGAGACGCCCGTTGATAGC - 2580
-SEEVVLKNGELEALETPVDS 
-LRRLFSRTVNSKHSRRPLIA 
*GGCSQER*TRSTRDAR**L 

2581 - TTCACAAATGGAGCTATCGTCGGCACACCAGTCTGTGTAAATGGCCTCATGCTCTTAGAG - 2640 
-FTNGA IVGTPVCVNGLMLLE 
-SQMELSSAHQSV*MASCS*R 
HKWS YRRHTSLCKWPHALRD 

2641 - ATTAAGGACAAAGAACAATACTGCGCATTGTCTCCTGGTTTACTGGCTACAAACAATGTC - 2700
-IKDKEQYCALSPGLLATNNV 
-LRTKNNTAHCLLVYWLQTMS 

* GQRTILRIVSWFTGYKQCL 

2701 - TTTCGCTTAAAAGGGGGTGCACCAATTAAAGGTGTAACCTTTGGAGAAGATACTGTTTGG - 2760 
-FRLKGGAP IKGVTFGE DTVW 
-FA*KGVHQLKV* PLEKILFG 
SLKRGCTN*RCNLWRRYCLG 
2761 - GAAGTTCAAGGTTACAAGAATGTGAGAATCACATTTGAGCTTGATGAACGTGTTGACAAA - 2820 
-EVQGYKNV RI TFELDE RVDK 
-KFKVTRM* ESHLS LMNVLTK 
SSRLQECENHI*A**TC*QS
2821 - GTGCTTAATGAAAAGTGCTCTGTCTACACTGTTGAATCCGGTACCGAAGTTACTGAGTTT - 2880
-VLNEKCSVYTVESGTEVTEF 
-CLMKSALS TLLNPVPKLLSL 
A**KVLCLHC*IRYRSY*VC 
2881 - GCATGTGTTGTAGCAGAGGCTGTTGTGAAGACTTTACAACCAGTTTCTGATCTCCTTACC - 2940
-ACVVAEAVVKTLQPVSDLLT
-HVL*QRLL*RLYNQFLISLP
MCCSRGCCEDFTTSF*SPYQ
2941 - AACATGGGTATTGATCTTGATGAGTGGAGTGTAGCTACATTCTACTTATTTGATGATGCT - 3000 
-NMGIDLDEWSVATF YLFDDA 
-TWVLILMSGV*LHSTYLMML 
HGY*S**VECSYILLI**CW 
3001 - GGTGAAGAAAACTTTTCATCACGTATGTATTGTTCCTTTTACCCTCCAGATGAGGAAGAA - 3060 
-GEENFSSRMYCS FYPP DEEE 
-VKKTFHHVCIVPFTLQMRKK 
*RKLFITYVLFLLPSR*GRR 
3061 - GAGGACGATGCAGAGTGTGAGGAAGAAGAAATTGATGAAACCTGTGAACATGAGTACGGT - 3120 
-EDDAECEEEEI DETCEHEYG 

- RTMQSVRKKKLMKPVNMSTV 

GRCRV*GRRN**NL*T*VRY 
3121 - ACAGAGGATGATTATCAAGGTCTCCCTCTGGAATTTGGTGCCTCAGCTGAAACAGTTCGA - 3180 
-TEDDYQGLPLEFGASAETVR 

- Q R M I IKVSLWNLV PQLKQFE 

RG*LSRSPSGIWCLS*NSSS 
3181 - GTTGAGGAAGAAGAAGAGGAAGACTGGCTGGATGATACTACTGAGCAATCAGAGATTGAG - 3240
-VEEEEEEDWLDDTTEQSEI E 

- LRKKKRKTGWMILLSNQRLS 

* GRRRGRLAG*YY*AIRD*A 

3241 - CCAGAACCAGAACCTACACCTGAAGAACCAGTTAATCAGTTTACTGGTTATTTAAAACTT - 3300 
-PEPEPTPEEPVNQFTGYLKL 
-QNQNLHLKNQLISLLVI*NL
RTRTYT*RTS * SVYWLFKTY 
3301 - ACTGACAATGTTGCCATTAAATGTGTTGACATCGTTAAGGAGGCACAAAGTGCTAATCCT - 3360
-TDNVAIKCVDIVKEAQSANP
-LTMLPLNVLTSLRRHKVLIL 
*QCCH*MC*HR*GGTKC*SY 
3361 - ATGGTGATTGTAAATGCTGCTAACATACACCTGAAACATGGTGGTGGTGTAGCAGGTGCA - 3420

- M V I VNAANI HLKHGGGVA GA 
-W*L*MLLTYT*NMVVV*QVH 

GDCKCC*HTPETWWWCSRCT 
3421 - CTCAACAAGGCAACCAATGGTGCCATGCAAAAGGAGAGTGATGATTACATTAAGCTAAAT - 3480
-LNKATNGAMQKESDDYIKLN 
-STRQPMVPCKRRVMITLS * M 
QQGNQWCHAKGE* * LH * A K W 
3481 - GGCCCTCTTACAGTAGGAGGGTCTTGTTTGCTTTCTGGACATAATCTTGCTAAGAAGTGT - 3540 
-G PLTVGG SCLLS GHNLAKKC 
-ALLQ*EGLVCFLDI I LLRSV 
PSYSRRVLFAFWT*SC*EVS 
3541 - CTGCATGTTGTTGGACCTAACCTAAATGCAGGTGAGGACATCCAGCTTCTTAAGGCAGCA - 3600 
-LHVVGPN LNAGE D I QLLKAA 
-CMLLDLT*MQVRTSS FLRQH 
ACCWT*PKCR*GHPAS*GSI 
3601 - TATGAAAATTTCAATTCACAGGACATCTTACTTGCACCATTGTTGTCAGCAGGCATATTT - 3660
-YENFNSQ DILLAPLLSAGI F 
-MKISIHRTSYLHHCCQQAYL 

* KFQFTGHLTCTIVVSRHIW 

3661 - GGTGCTAAACCACTTCAGTCTTTACAAGTGTGCGTGCAGACGGTTCGTACACAGGTTTAT - 3720

- G A K P LQS LQVCVQTVRT QVY 
-VLNHFSLYKCACRRFVHRFI 

C*TTSVFTSVRADGSYTGLY 
3721 - ATTGCAGTCAATGACAAAGCTCTTTATGAGCAGGTTGTCATGGATTATCTTGATAACCTG - 3780 
-I AVN DKALYEQVVMDYLDNL 
-LQSMTKLFMSRLSWIILIT* 
CSQ*QSSL*AGCHGLS**PE 
3781 - AAGCCTAGAGTGGAAGCACCTAAACAAGAGGAGCCACCAAACACAGAAGATTCCAAAACT - 3840 
-KPRVEAPKQEEPPNTEDSKT 
-SLEWKHLNKRSHQTQKIPKL 
A*SGST*TRGATKHRRFQN* 
3841 - GAGGAGAAATCTGTCGTACAGAAGCCTGTCGATGTGAAGCCAAAAATTAAGGCCTGCATT - 3900
-EEKSVVQKPVDVKPKIKACI 
-RRNLSYRSLSM* SQKLRPAL 
GEICRTEACRCEAKN*GLH* 
3901 - GATGAGGTTACCACAACACTGGAAGAAACTAAGTTTCTTACCAATAAGTTACTCTTGTTT - 3960 
-DEVTTTLEETKFLTNKLLLF 
-MRLPQH WKKLS FLPI SYS CL 
*GYHNTGRN*VSYQ*VTLVC 

3961 - GCTGATATCAATGGTAAGCTTTACCATGATTCTCAGAACATGCTTAGAGGTGAAGATATG - 4020

-ADINGKLYHDSQNMLRGE DM 
-LI SMVSFTMILRTCLEVKIC 

* YQW*ALP* F S E H A * R * R Y V 

4021 - TCTTTCCTTGAGAAGGATGCACCTTACATGGTAGGTGATGTTATCACTAGTGGTGATATC - 4080 

- S FLEKDAPYMVGDVI T S GDI 
-LSLRRMHLTW*VMLSLVVIS 

FP*EGCTLHGR*CYH*W*YH 

4081 - ACTTGTGTTGTAATACCCTCCAAAAAGGCTGGTGGCACTACTGAGATGCTCTCAAGAGCT - 4140

-TCVVI PS KKAGGTTEML SRA 

- L V L * YPPKRLVALLRCSQEL 

LCCNTLQKGWWHY* DALKSF 
4141 - TTGAAGAAAGTGCCAGTTGATGAGTATATAACCACGTACCCTGGACAAGGATGTGCTGGT - 4200
-LKKVPVDEY I TTYPGQGCAG 
-*RKCQLMSI *PRTLDKDVLV 

EESAS**VYNHVPWTRMCWL 
4201 - TATACACTTGAGGAAGCTAAGACTGCTCTTAAGAAATGCAAATCTGCATTTTATGTACTA - 4260
-YTLEEAKTALKKCKSAFYVL 

- IHLRKLRLLLRNANLHFMYY 

YT*GS*DCS*EMQICILCTT 
4261 - CCTTCAGAAGCACCTAATGCTAAGGAAGAGATTCTAGGAACTGTATCCTGGAATTTGAGA - 4320 
-PSEAPNAKEEILGTVSWNLR 
-LQKHLMLRKRF*ELYPGI*E 
F R S T * C * GRDSRNCILEFER 
4321 - GAAATGCTTGCTCATGCTGAAGAGACAAGAAAATTAATGCCTATATGCATGGATGTTAGA - 4380 
-EMLAHAEETRKLMP ICMDVR 
-KCLLMLKRQEN*CLYAWMLE 
NACSC*RDKKINAYMHGC*S 
4381 - GCCATAATGGCAACCATCCAACGTAAGTATAAAGGAATTAAAATTCAAGAGGGCATCGTT - 4440
-AIMATIQRKYKG IKIQEGIV 
P*WQPSNVSIKELKFKRASL 
HNGNHPT*V*RN*NSRGHR* 
4441 - GACTATGGTGTCCGATTCTTCTTTTATACTAGTAAAGAGCCTGTAGCTTCTATTATTACG - 4500 
-DYGVRFFFYTSKEPVASI IT 
-TMVS DSSFILVKSL* LLLLR 
LWCPILLLY**RACSFYYYE 
4501 - AAGCTGAACTCTCTAAATGAGCCGCTTGTCACAATGCCAATTGGTTATGTGACACATGGT - 4560 
-KLNS LNEPLVTMP I GYVTHG 

- S *TL*MSRLSQCQLVM*HMV 

AELSK*AACHNANWLCDTWF 
4561 - TTTAATCTTGAAGAGGCTGCGCGCTGTATGCGTTCTCTTAAAGCTCCTGCCGTAGTGTCA - 4620
-FNLEEAARCMRS LKAPAVVS 

- LILKRLRAVCVLLKLLP*CQ 

*S*RGCALYAFS*SSCRSVS 
4621 - GTATCATCACCAGATGCTGTTACTACATATAATGGATACCTCACTTCGTCATCAAAGACA - 4680

- V S S P DAVTTYNGYLTS S SKT 

- YHHQMLLLHIMDT SLRHQRH 

IITRCCYYI*WIPHFVIKDI 
4681 - TCTGAGGAGCACTTTGTAGAAACAGTTTCTTTGGCTGGCTCTTACAGAGATTGGTCCTAT - 4740
-SEEHFVETVSLAGSYRDWSY 
-LRSTL*KQFLWLALTEIGPI 
*GALCRNSFFGWLLQRLVLF
4741 - TCAGGACAGCGTACAGAGTTAGGTGTTGAATTTCTTAAGCGTGGTGACAAAATTGTGTAC - 4800

- S GQRTELGVEFLKRGDKIVY 
-QDSVQS*VLNFLSVVTKLCT 

RTAYRVRC*IS*AW*QNCVP
4801 - CACACTCTGGAGAGCCCCGTCGAGTTTCATCTTGACGGTGAGGTTCTTTCACTTGACAAA - 4860 

-HTLESPVEFHLDGEVLSLDK
-TLWRAPSSFILTVRFFHLTN 

HSGEPRRVSS*R*GSFT*QT 
4861 - CTAAAGAGTCTCTTATCCCTGCGGGAGGTTAAGACTATAAAAGTGTTCACAACTGTGGAC - 4920
-LKSLLSLREVKT IKVFTTVD 
*RVSYPCGRLRL*KCSQLWT 
KESLIPAGG*DYKSVHNCGQ 
4921 - AACACTAATCTCCACACACAGCTTGTGGATATGTCTATGACATATGGACAGCAGTTTGGT - 4980

- N TNLHTQLVDMS MTY GQQFG 
-TLISTHSLWICL*HMDSSLV 

H*SPHTACGYVYDIWTAVWS 
4981 - CCAACATACTTGGATGGTGCTGATGTTACAAAAATTAAACCTCATGTAAATCATGAGGGT - 5040
-PTYLDGADVTKI KPHVNHEG 
-QHTWMVLMLQKLNLM* IMRV 

NILGWC*CYKN*TSCKS*G* 
5041 - AAGACTTTCTTTGTACTACCTAGTGATGACACACTACGTAGTGAAGCTTTCGAGTACTAC - 5100
-KTFFVLPSDDTLRSEAFEYY 
R L S LYYLVMTHYVVKLS S TT 
D F L C T T * * * H T T * * SFRVLP 
5101 - CATACTCTTGATGAGAGTTTTCTTGGTAGGTACATGTCTGCTTTAAACCACACAAAGAAA - 5160 
-HTLDESFLGRYMSALNHTKK 

- ILLMRVFLVGTCLL* TTQRN 

YS**EFSW*VHVCFKPHKEM 
5161 - TGGAAATTTCCTCAAGTTGGTGGTTTAACTTCAATTAAATGGGCTGATAACAATTGTTAT - 5220
-WKFPQVGGLTSI KWADNNCY 

- GNFLKLVV*LQLNGLITIVI 

EISSSWWFNFN*MG* * QLLF 
5221 - TTGTCTAGTGTTTTATTAGCACTTCAACAGCTTGAAGTCAAATTCAATGCACCAGCACTT - 5280

- L S SVLLALQQLEVKFNA PAL 
-CLVFY*HFNSLKSNSMHQHF 

V*CFISTSTA*SQIQCTSTS 
5281 - CAAGAGGCTTATTATAGAGCCCGTGCTGGTGATGCTGCTAACTTTTGTGCACTCATACTC - 5340
-QEAYYRARAGDAAN FCAL I L 

- K R L I IEPVLVMLLTFVHSYS 

RGLL*SPCW*CC*LLCTHTR 
5341 - GCTTACAGTAATAAAACTGTTGGCGAGCTTGGTGATGTCAGAGAAACTATGACCCATCTT - 5400 
-AYSNKTVGELGDVRETMTHL 
-LTVIKLLASLVMSEKL*P IF 
LQ**NCWRAW*CQRNYDPSS 
5401 - CTACAGCATGCTAATTTGGAATCTGCAAAGCGAGTTCTTAATGTGGTGTGTAAACATTGT - 5460 
-LQHANLE SAKRVLNVVCKHC 
YSMLIWNLQSEFLMWCVNIV 
TAC*FGICKASS*CGV*TLW 
5461 - GGTCAGAAAACTACTACCTTAACGGGTGTAGAAGCTGTGATGTATATGGGTACTCTATCT - 5520 
-GQKTTTLTGVEAVMYMG TLS 

- VRKLLP*RV*KL*CIWVLYL 

SENYYLNGCRSCDVYGYSIL 
5521 - TATGATAATCTTAAGACAGGTGTTTCCATTCCATGTGTGTGTGGTCGTGATGCTACACAA - 5580
-YDNLKTGVSIPCVCGRDATQ 
-MI I LRQVFPFHVCVVVMLHN 
**S*DRCFHSMCVWS*CYTI 
5581 - TATCTAGTACAACAAGAGTCTTCTTTTGTTATGATGTCTGCACCACCTGCTGAGTATAAA - 5640 
-YLVQQES S FVMM SAP PAEYK 
-I*YNKSLLLL*CLHHLLSIN 
SSTTRVFFCYDVCTTC*V*I 
5641 - TTACAGCAAGGTACATTCTTATGTGCGAATGAGTACACTGGTAACTATCAGTGTGGTCAT - 5700 
-LQQGTFLCANEY TGNYQCGH 
-YSKVHSYVRMSTLVTISVVI 
TARYILMCE*VHW*LSVWSL 
5701 - TACACTCATATAACTGCTAAGGAGACCCTCTATCGTATTGACGGAGCTCACCTTACAAAG - 5760
-YTHITAKETLYRIDGAHLTK 

-TLI*LLRRPSIVLTELTLQR

HSYNC*GDPLSY*RSSPYKD 
5761 - ATGTCAGAGTACAAAGGACCAGTGACTGATGTTTTCTACAAGGAAACATCTTACACTACA - 5820
-MSEYKGPVTDVFYKETSYTT 
-CQSTKDQ*LMFSTRKHLTLQ 
VRVQRTSD*CFLQGWILHYN 
5821 - ACCATCAAGCCTGTGTCGTATAAACTCGATGGAGTTACTTACACAGAGATTGAACCAAAA - 5880

- T IKPVSYKLDGVTYTEI EPK 

- PSSLCRINSMELLTQRLNQN 

HQACVV*TRWSYLHRD*TKI
5881 - TTGGATGGGTATTATAAAAAGGATAATGCTTACTATACAGAGCAGCCTATAGACCTTGTA - 5940
-LDGYYKKDNAYYTEQP I DLV 

- W M G I IKRIMLTIQSSL*TLY 

GWVL*KG* CLLYRAAYRPCT 
5941 - CCAACTCAACCATTACCAAATGCGAGTTTTGATAATTTCAAACTCACATGTTCTAACACA - 6000 
-PTQPLPNAS FDN FKLTCSNT 
-QLNHYQMRVLI ISNSHVLTQ 
NSTITKCEF* *FQTHMF*HK 
6001 - AAATTTGCTGATGATTTAAATCAAATGACAGGCTTCACAAAGCCAGCTTCACGAGAGCTA - 6060
-KFADDLNQMTGFTKPASREL 
-NLLMI*IK*QASQSQLHESY 
IC**FKSNDRLHKASFTRAI 
6061 - TCTGTCACATTCTTCCCAGACTTGAATGGCGATGTAGTGGCTATTGACTATAGACACTAT - 6120 
-SVTFFPDLNGDVVAI DYRHY 
-LSHSSQT*MAM*WLLTIDTI 
CHILPRLEWRCSGY*L*TLF 
6121 - TCAGCGAGTTTCAAGAAAGGTGCTAAATTACTGCATAAGCCAATTGTTTGGCACATTAAC - 6180 
-SAS FKKGAKLLHKP IVWHIN 
-QRVSRKVLNYCISQLFGTLT 
SEFQERC* ITA*ANCLAH*P 
6181 - CAGGCTACAACCAAGACAACGTTCAAACCAAACACTTGGTGTTTACGTTGTCTTTGGAGT - 6240
-QATTKTTFKPNTWCLRCLWS
-RLQPRQRSNQTLGVYVVFGV 
GYNQDNVQTKHLVFTLSLEY 
6241 - ACAAAGCCAGTAGATACTTCAAATTCATTTGAAGTTCTGGCAGTAGAAGACACACAAGGA - 6300
-TKPVDTSN S FEVLAVE DTQG 
-QS Q * ILQIHLKFWQ*KTHKE 
KASRYFKFI * SSGSRRHTRN 
6301 - ATGGACAATCTTGCTTGTGAAAGTCAACAACCCACCTCTGAAGAAGTAGTGGAAAATCCT - 6360 

- M DNLACESQQPT SEEVVENP 
-WT ILLVKVNNPPLKK* WKIL 

GQSCL*KSTTHL* RSSGKSY 
6361 - ACCATACAGAAGGAAGTCATAGAGTGTGACGTGAAAACTACCGAAGTTGTAGGCAATGTC - 6420 
-TIQKEVIECDVKTTEVVGNV 

-PYRRKS*SVT*KLPKL*AMS

HTEGSHRV*RENYRSCRQCH 
6421 - ATACTTAAACCATCAGATGAAGGTGTTAAAGTAACACAAGAGTTAGGTCATGAGGATCTT - 6480 
-I LKPS DEGVKVTQELGHE DL 
-YLNHQMKVLK*HKS*VMRIL
T*TIR*RC*SNTRVRS*GSY 
6481 - ATGGCTGCTTATGTGGAAAACACAAGCATTACCATTAAGAAACCTAATGAGCTTTCACTA - 6540
-MAAYVENTS ITIKKPNELSL 
-WLLMWKTQALPLRNLMSFH* 
GCLCGKHKHYH*ET**AFTS 
6541 - GCCTTAGGTTTAAAAACAATTGCCACTCATGGTATTGCTGCAATTAATAGTGTTCCTTGG - 6600 
-ALGLKT IATHGIAAINSVPW 

- P*V*KQLPLMVLLQLI VFLG 

LRFKNNCHSWYCCN* * C S L E 
6601 - AGTAAAATTTTGGCTTATGTCAAACCATTCTTAGGACAAGCAGCAATTACAACATCAAAT - 6660 
-SKILAYVKPFLGQAAI TTSN 

- VKFWLMSNHS* DKQ QLQHQI 

*NFGLCQTILRTS SNYNIKL 
6661 - TGCGCTAAGAGATTAGCACAACGTGTGTTTAACAATTATATGCCTTATGTGTTTACATTA - 6720
-CAKRLAQRVFNNYMPYVFTL 

- A L R D * HNVCLT I ICLMCLHY 

R*EISTTCV*QLYALCVYII 
6721 - TTGTTCCAATTGTGTACTTTTACTAAAAGTACCAATTCTAGAATTAGAGCTTCACTACCT - 6780
-LFQLCTFTKSTNSRIRASLP 
-CSNCVLLLKVP ILELELHYL 
VPIVYFY*KYQF*N*SFTTY 
6781 - ACAACTATTGCTAAAAATAGTGTTAAGAGTGTTGCTAAATTATGTTTGGATGCCGGCATT - 6840 
-TTIAKNSVKSVAKLCL D A G I 
-QLLLKIVLRVLLNYVWMPAL 
NYC*K*C*ECC*IMFGCRH* 
6841 - AATTATGTGAAGTCACCCAAATTTTCTAAATTGTTCACAATCGCTATGTGGCTATTGTTG - 6900 
-NYVKS PKFSKLFT IAMWLLL 
-IM*SHPNFLNCSQSLCGYCC 
LCEVTQIF*IVHNRYVAIVV 
6901 - TTAAGTATTTGCTTAGGTTCTCTAATCTGTGTAACTGCTGCTTTTGGTGTACTCTTATCT - 6960 
-LS ICLGSLI CVTAAFGVLLS 
-*VFA*VL*SV*LLLLVYSYL 
KYLLRFSNLCNCCFWCTLI* 

6961 - AATTTTGGTGCTCCTTCTTATTGTAATGGCGTTAGAGAATTGTATCTTAATTCGTCTAAC - 7020

-NFGAPSY CNGVRELYLNSSN 
-ILVLLLIVMALENCILIRLT 
FWCSFLL*WR*RIVS*FV*R 

7021 - GTTACTACTATGGATTTCTGTGAAGGTTCTTTTCCTTGCAGCATTTGTTTAAGTGGATTA - 7080

-VTTMDFCEGSFPCSICLSGL 
-LLLWISVKVLFLAAFV*VD* 
YYYGFL*RFFSLQHLFKWIR 
7081 - GACTCCCTTGATTCTTATCCAGCTCTTGAAACCATTCAGGTGACGATTTCATCGTACAAG - 7140
-DSLDSYPALETIQVTISSYK 
-TPLILIQLLKPFR*RFHRTS
LP*FLSSS*NHSGDDFIVQA 
7141 - CTAGACTTGACAATTTTAGGTCTGGCCGCTGAGTGGGTTTTGGCATATATGTTGTTCACA - 7200
-LDLTI LGLAAEWVLAYMLFT 
-*T*QF*VWPLSGFWHICCSQ 
RLDNFRSGR*VGFGIYVVHK 
7201 - AAATTCTTTTATTTATTAGGTCTTTCAGCTATAATGCAGGTGTTCTTTGGCTATTTTGCT - 7260
-KFFYLLGL SAIMQVFFGYFA 
-NSFIY*VFQL*CRCSLAILL 
ILLFIRSFSYNAGVLWLFC* 
7261 - AGTCATTTCATCAGCAATTCTTGGCTCATGTGGTTTATCATTAGTATTGTACAAATGGCA - 7320 
-SHFISNSWLMWFI I S I V Q M A 
-VISSAILGSCGLSLVLYKWH 
SFHQQFLAHVVYH*YCTNGT 
7321 - CCCGTTTCTGCAATGGTTAGGATGTACATCTTCTTTGCTTCTTTCTACTACATATGGAAG - 7380
-PVSAMVRMYI FFASFYYIWK 
-PFLQWLGCTSSLLLSTTYGR 
R F C N G * DVHLLCFFLLHMEE 
7381 - AGCTATGTTCATATCATGGATGGTTGCACCTCTTCGACTTGCATGATGTGCTATAAGCGC - 7440
-SYVHIMDGCTSSTCMMCYKR
-AMFISWMVAPLRLA*CAISA
LCSYHGWLHLFDLHDVL*AQ 
7441 - AATCGTGCCACACGCGTTGAGTGTACAACTATTGTTAATGGCATGAAGAGATCTTTCTAT - 7500
-NRATRVECTT IVNGMKRSFY 
- IVPHALSVQLLLMA*RDLSM 
SCHTR*VYNYC*WHEEIFLC 
7501 - GTCTATGCAAATGGAGGCCGTGGCTTCTGCAAGACTCACAATTGGAATTGTCTCAATTGT - 7560 
-VYANGGRGFCKTHNWNCLNC 
-SMQMEAVASARLTIGIVSIV 
LCKWRPWLLQDSQLELSQL*
7561 - GACACATTTTGCACTGGTAGTACATTCATTAGTGATGAAGTTGCTCGTGATTTGTCACTC - 7620
-DTFCTGSTFISDEVARDLSL 

- THFALVVHSLVMKLLVICHS 

HILHW*YIH***SCS*FVTP 
7621 - CAGTTTAAAAGACCAATCAACCCTACTGACCAGTCATCGTATATTGTTGATAGTGTTGCT - 7680 
-QFKRPINPTDQSSYIVDSVA 

- SLKDQSTLLTSHRILLIVLL 

V *KTNQPY*PVIVYC**CCC 
7681 - GTGAAAAATGGCGCGCTTCACCTCTACTTTGACAAGGCTGGTCAAAAGACCTATGAGAGA - 7740 
-VKNGALHLYFDKAGQKTYER 

- * KMARFTSTLTRLVKRPMRD 

EKWRAS PLL*QGWSKDL*ET 
7741 - CATCCGCTCTCCCATTTTGTCAATTTAGACAATTTGAGAGCTAACAACACTAAAGGTTCA - 7800
-HPLSHFVNLDNLRANNTKGS
-IRSPILSI*TI*ELTTLKVH
SALPFCQFRQFES*QH * R F T 
7801 - CTGCCTATTAATGTCATAGTTTTTGATGGCAAGTCCAAATGCGACGAGTCTGCTTCTAAG - 7860
-LP INVIVFDGKSKCDESASK 
-CLLMS*FLMASPNATSLLLS 
AY*CHSF*WQVQMRRVCF*V 
7861 - TCTGCTTCTGTGTACTACAGTCAGCTGATGTGCCAACCTATTCTGTTGCTTGACCAAGCT - 7920 
-SASVYYSQLMCQP ILLLDQA 
-LLLCTTVS*CANLFCCLTKL 
CFCVLQSADVPTYSVA*PSS 
7921 - CTTGTATCAAACGTTGGAGATAGTACTGAAGTTTCCGTTAAGATGTTTGATGCTTATGTC - 7980
-LVSNVGDSTEVSVKMFDAYV 
-LYQTLEIVLKFPLRCLMLMS 
CIKRWR*Y*SFR*DV*CLCR
7981 - GACACCTTTTCAGCAACTTTTAGTGTTCCTATGGAAAAACTTAAGGCACTTGTTGCTACA - 8040
-DTFSATFSVPMEKLKALVAT 

- TPFQQLLVFLWKNLRHLLLQ 

HLFSNF*CSYGKT*GTCCYS 
8041 - GCTCACAGCGAGTTAGCAAAGGGTGTAGCTTTAGATGGTGTCCTTTCTACATTCGTGTCA - 8100
-AHSELAKGVALDGVLST FVS 
-LTAS * Q R V * L*MVSFLHSCQ 
SQRVSKGCSFRWCPFYIRVS 
8101 - GCTGCCCGACAAGGTGTTGTTGATACCGATGTTGACACAAAGGATGTTATTGAATGTCTC - 8160 
-AARQGVV DTDVDTKDVIECL 
-LPDKVLL IPMLTQRMLLNVS 
CPTRCC*YRC*HKGCY*MSQ 
8161 - AAACTTTCACATCACTCTGACTTAGAAGTGACAGGTGACAGTTGTAACAATTTCATGCTC - 8220 
-KLSHHS DLEVTG DSCNN FML 
-NFHITLT*K*QVTVVTISCS 
TFTSL*LRSDR*QL*QFHAH 
8221 - ACCTATAATAAGGTTGAAAACATGACGCCCAGAGATCTTGGCGCATGTATTGACTGTAAT - 8280 
-TYNKVENMTPRDLGACI DCN 
-PI IRLKT*RPEILAHVLTVM 
L * *G*KHDAQRSWRMY*L*C 
8281 - GCAAGGCATATCAATGCCCAAGTAGCAAAAAGTCACAATGTTTCACTCATCTGGAATGTA - 8340

- A R H I NAQVAKS HNVS LIWNV 
-QGISMPK*QKVTMFHSSGM* 

KAYQCPSSKKSQCFTHLECK 
8341 - AAAGACTACATGTCTTTATCTGAACAGCTGCGTAAACAAATTCGTACTGCTGCCAAGAAG - 8400

- K D Y M SLS EQLRKQIRTAAKK 
-KTTCLYLNSCVNKFVLLPRR 

RLHVFI*TAA* TNSYCCQEE 
8401 - AACAACATACCTTTTACACTAACTTGTGCTACAACTAGACAGGTTGTCAATGTCATAACT - 8460
-NNIPFTLTCATTRQVVNVIT
-TTYLLH*LVLQLDRLSMS*L

QHTFYTNLCYN*TGCQCHNY 
8461 - ACTAAAATCTCACTCAAGGGTGGTAAGATTGTTAGTACTTGTTTTAAACTTATGCTTAAG - 8520 

- T K I SLKGGKIVSTCFKLMLK 
-LKSHSRVVRLLVLVLNLCLR 

*NLTQGW*DC*YLF*TYA*G 
8521 - GCCACATTATTGTGCGTTCTTGCTGCATTGGTTTGTTATATCGTTATGCCAGTACATACA - 8580
-ATLLCVLAALVCYIVMPVHT
-PHYCAFLLHWFVISLCQYIH 
HIIVRSCCIGLLYRYASTYI 
8581 - TTGTCAATCCATGATGGTTACACAAATGAAATCATTGGTTACAAAGCCATTCAGGATGGT - 8640
-LS I H DGYTNEI I GYKAI Q DG 
-CQSMMVTQMKSLVTKPFRMV 
VNP*WLHK*NHWLQSHSGWC 
8641 - GTCACTCGTGACATCATTTCTACTGATGATTGTTTTGCAAATAAACATGCTGGTTTTGAC - 8700
-VTRD I I STDDCFANKHAGFD 

- SLVTSFLLMIVLQINMLVLT 

HS*HHFY**LFCK*TCWF*R 
8701 - GCATGGTTTAGCCAGCGTGGTGGTTCATACAAAAATGACAAAAGCTGCCCTGTAGTAGCT - 8760 
-AWFSQRGGSYKNDK SCPVVA 
-HGLASVVVHTKMTKAAL* * L 
MV*PAWWFIQK*QKLPCSSC 
8761 - GCTATCATTACAAGAGAGATTGGTTTCATAGTGCCTGGCTTACCGGGTACTGTGCTGAGA - 8820 

- A I ITRE IGFIVPGLPGTVLR 
-LSLQERLVS*CLAYRVLC*E

YHYKRDWFHSAWLTGYCAES 
8821 - GCAATCAATGGTGACTTCTTGCATTTTCTACCTCGTGTTTTTAGTGCTGTTGGCAACATT - 8880
-AINGDFLHFLPRVFSAVGNI
-QSMVTSCIFYLVFLVLLATF 
N Q W * LLAFSTSCF*CCWQHL 
8881 - TGCTACACACCTTCCAAACTCATTGAGTATAGTGATTTTGCTACCTCTGCTTGCGTTCTT - 8940

- C Y T PSKLIEYSDFATSACVL 

- ATHLPNSLSIVILLPLLAFL 

LHTFQTH*V** FCYLCLRSC 
8941 - GCTGCTGAGTGTACAATTTTTAAGGATGCTATGGGCAAACCTGTGCCATATTGTTATGAC - 9000
-AAECTI FKDAMGKPVPYCYD 
LLSVQFLRMLWANLCHIVMT 
C*VYNF*GCYGQTCAILL*H 
9001 - ACTAATTTGCTAGAGGGTTCTATTTCTTATAGTGAGCTTCGTCCAGACACTCGTTATGTG - 9060 
-TNLLEGS I SYSELRPDTRYV 
-LIC*RVLFLIVSFVQTLVMC 
* FARGFYFL* * A S SRHSLCA 
9061 - CTTATGGATGGTTCCATCATACAGTTTCCTAACACTTACCTGGAGGGTTCTGTTAGAGTA - 9120 
-LMDGSIIQFPNTYLEGSVRV

- LWMVPSYSFLTLFWRVLLE* 

YGWFHHTVS *HLPGGFC*SS 
9121 - GTAACAACTTTTGATGCTGAGTACTGTAGACATGGTACATGCGAAAGGTCAGAAGTAGGT - 9180
-VTTFDAEYCRHGTCERSEVG
-*QLLMLSTVDMVHAKGQK*V
NNF*C*VL*TWYMRKVRSRY
9181 - ATTTGCCTATCTACCAGTGGTAGATGGGTTCTTAATAATGAGCATTACAGAGCTCTATCA - 9240
-I CLS TSGRWVLNNEHYRALS 
FAYLPVVDGFLIMSITELYQ 
LPIYQW*MGS***ALQSSIR 
9241 - GGAGTTTTCTGTGGTGTTGATGCGATGAATCTCATAGCTAACATCTTTACTCCTCTTGTG - 9300
-GVFCGVDAMNLIANIFTPLV
-EFSVVLMR*IS*LTSLLLLC
SFLWC*CDESHS*HLYSSCA
9301 - CAACCTGTGGGTGCTTTAGATGTGTCTGCTTCAGTAGTGGCTGGTGGTATTATTGCCATA - 9360
-QPVGALDVSASVVAGGI IAI 
~NLWVL*MCLLQ*WLVVLLPY 
TCGCFRCVCFSSGWWYYCHI 
9361 - TTGGTGACTTGTGCTGCCTACTACTTTATGAAATTCAGACGTGTTTTTGGTGAGTACAAC - 9420 
-LVTCAAYYFMKFRRVFGEYN 
-W*LVLPTTL*NSDVFLVSTT 
GDLCCLLLYE IQTCFW * V Q P 
9421 - CATGTTGTTGCTGCTAATGCACTTTTGTTTTTGATGTCTTTCACTATACTCTGTCTGGTA - 9480
-HVVAANALLFLMSFTILCLV
-MLLLLMHFCF*CLSLYSVWY 
C C C C * CTFVFDVFHYTLSGT 
9481 - CCAGCTTACAGCTTTCTGCCGGGAGTCTACTCAGTCTTTTACTTGTACTTGACATTCTAT - 9540 
-PAYSFLPGVYSVFYLYLTFY 
-QLTAFCRESTQSFTCT*HSI 
SLQLSAGSLLSLLLVLDILF 
9541 - TTCACCAATGATGTTTCATTCTTGGCTCACCTTCAATGGTTTGCCATGTTTTCTCCTATT - 9600 
-FTNDVSFLAHLQWFAMFSPI 
-SPMMFHSWLTFNGLPCFLLL 
HQ*CFILGSPSMVCHVFSYC 
9601 - GTGCCTTTTTGGATAACAGCAATCTATGTATTCTGTATTTCTCTGAAGCACTGCCATTGG - 9660 
-VPFWITAIYVFCISLKHCHW 
-CLFG*QQSMYSVFL*STAIG 
AFLDNSNLCILYFSEALPLV 
9661 - TTCTTTAACAACTATCTTAGGAAAAGAGTCATGTTTAATGGAGTTACATTTAGTACCTTC - 9720 
-FFNNYLRKRVMFNGVT FSTF 
-SLTTILGKESCLMELHLVPS 
L*QLS*EKSHV*WSYI*YLR 
9721 - GAGGAGGCTGCTTTGTGTACCTTTTTGCTCAACAAGGAAATGTACCTAAAATTGCGTAGC - 9780 
-EEAALCTFLLNKEMYLKLRS 
-RRLLCVPFCSTRKCT*NCVA 
GGCFVYLFAQQGNVPKIA*R 
9781 - GAGACACTGTTGCCACTTACACAGTATAACAGGTATCTTGCTCTATATAACAAGTACAAG - 9840
-ETLLPLTQYNRYLALYNKYK 
-RHCCHLHSITGILLYITSTS 
DTVATYTV*QVSCSI*QVQV 
9841 - TATTTCAGTGGAGCCTTAGATACTACCAGCTATCGTGAAGCAGCTTGCTGCCACTTAGCA - 9900 
-YFS GALDTTSYREAACCHLA 
-ISVEP*ILPAIVKQLAAT*Q 
FQWSLRYYQLS*SSLLPLSK 
9901 - AAGGCTCTAAATGACTTTAGCAACTCAGGTGCTGATGTTCTCTACCAACCACCACAGACA - 9960
-KALNDFSNSGADVLYQPPQT
-RL*MTLATQVLMFSTNHHRH 
GSK*L*QLRC*CSLPTTTDI 
9961 - TCAATCACTTCTGCTGTTCTGCAGAGTGGTTTTAGGAAAATGGCATTCCCGTCAGGCAAA - 10020
-SITSAVLQSGFRKMAFPSGK 
-QSLLLFCRVVLGKWHSRQAK 
NHFCCSAEWF*ENGIPVRQS 
10021 - GTTGAAGGGTGCATGGTACAAGTAACCTGTGGAACTACAACTCTTAATGGATTGTGGTTG - 10080 
-VEGCMVQVTCGTTTLNG LWL 
-LKGAWYK* PVELQLLMDCGW 
*RVHGTSNLWNYNS*WIVVG 
10081 - GATGACACAGTATACTGTCCAAGACATGTCATTTGCACAGCAGAAGACATGCTTAATCCT - 10140
-DDTVYCPRHVICTAEDMLNP 

- MTQYTVQDMS FAQQKTCLIL 

* HSILSKTCHLHSRRHA*S * 
10141 - AACTATGAAGATCTGCTCATTCGCAAATCCAACCATAGCTTTCTTGTTCAGGCTGGCAAT - 10200 
-NYEDLLI RKSNHS FLVQAGN 

- T M K I CS FANPTIAFLFRLAM 

L*RSAHSQIQP*LSCSGWQC 
10201 - GTTCAACTTCGTGTTATTGGCCATTCTATGCAAAATTGTCTGCTTAGGCTTAAAGTTGAT - 10260 
-VQLRVIGHSMQNCLLRLKVD 
-FNFVLLAILCKIVCLGLKLI 
STSCYWPFYAKLSA*A*S*Y 
10261 - ACTTCTAACCCTAAGACACCCAAGTATAAATTTGTCCGTATCCAACCTGGTCAAACATTT - 10320 
-TSNPKTPKYKFVRIQPGQTF 
-LLTLRHPSINLSVSNLVKHF 
F*P*DTQV*ICPYPTWSNIF 
10321 - TCAGTTCTAGCATGCTACAATGGTTCACCATCTGGTGTTTATCAGTGTGCCATGAGACCT - 10380 
-SVLACYNGS PSGVYQCAMRP 
-QF*HATMVHHLVFISVP* DL 
SSSMLQWFTIWCLSVCHET* 
10381 - AATCATACCATTAAAGGTTCTTTCCTTAATGGATCATGTGGTAGTGTTGGTTTTAACATT - 10440 

- N H T IKGS FLNGS CGSVG FN I 
-I I PLKVLSLMDHVVVLVLTL 

SYH*RFFP*WIMW*CWF*H* 
10441 - GATTATGATTGCGTGTCTTTCTGCTATATGCATCATATGGAGCTTCCAACAGGAGTACAC - 10500
-DYDCVSFCYMHHMELPTGVH 
-IMIACLSAICIIWSFQQEYT 
L*LRVFLLYASYGASNRSTR 
10501 - GCTGGTACTGACTTAGAAGGTAAATTCTATGGTCCATTTGTTGACAGACAAACTGCACAG - 10560 
-AGTDLEGKFYGPFVDRQTAQ 

- LVLT * KVNSMVHLLTDKLHR 

WY*LRR*ILWSIC*QTNCTG 
10561 - GCTGCAGGTACAGACACAACCATAACATTAAATGTTTTGGCATGGCTGTATGCTGCTGTT - 10620
-AAGT DTT I TLNVLAWLYAAV 
-LQVQTQP*H*MFWHGCMLLL 
CRYRHNHW IKC FGMAVCCCY 
10621 - ATCAATGGTGATAGGTGGTTTCTTAATAGATTCACCACTACTTTGAATGACTTTAACCTT - 10680 
-INGDRWFLNRFTTTLNDFNL 
-SMVIGGFLIDSPLL*MTLTL 
QW**VVS**IHHYFE*L*PC 
10681 - GTGGCAATGAAGTACAACTATGAACCTTTGACACAAGATCATGTTGACATATTGGGACCT - 10740
-VAMKYNYE PLTQDHVDI LGP 
-WQ*STTMNL* HKIMLTYWDL 
G N E V Q L * TFDTRSC*H IGTS 
10741 - CTTTCTGCTCAAACAGGAATTGCCGTCTTAGATATGTGTGCTGCTTTGAAAGAGCTGCTG - 10800
-LSAQTGIAVLDMCAALKELL
-FLLKQELPS*ICVLL*KSCC
FCSNRNCRLRYVCCFERAAA
10801 - CAGAATGGTATGAATGGTCGTACTATCCTTGGTAGCACTATTTTAGAAGATGAGTTTACA - 10860
-QNGMNGRTILGSTILEDEFT
-RMV*MVVLSLVALF* KMSLH 
EWYEWSYYPW*HYFRR*VYT 
10861 - CCATTTGATGTTGTTAGACAATGCTCTGGTGTTACCTTCCAAGGTAAGTTCAAGAAAATT - 10920 
-PFDVVRQCSGVTFQGKFKKI 
-HLMLLDNALVLPSKVSSRKL 
I*CC*TMLWCYLPR*VQENC 
10921 - GTTAAGGGCACTCATCATTGGATGCTTTTAACTTTCTTGACATCACTATTGATTCTTGTT - 10980
-VKGTHHWMLLTFLTSLLILV
-LRALIIGCF*LS*HHY*FLF
*GHSSLDAFNFLDITIDSCS
10981 - CAAAGTACACAGTGGTCACTGTTTTTCTTTGTTTACGAGAATGCTTTCTTGCCATTTACT - 11040
-QSTQWSLFFFVYENAFLPFT
-KVHSGHCFSLFTRMLSCHLL
KYTVVTVFLCLRECFLAIYS
11041 - CTTGGTATTATGGCAATTGCTGCATGTGCTATGCTGCTTGTTAAGCATAAGCACGCATTC - 11100
-LGIMAIAACAMLLVKHKHAF
-LVLWQLLHVLCCLLSISTHS
WYYGNCCMCYAAC*A*ARIL
11101 - TTGTGCTTGTTTCTGTTACCTTCTCTTGCAACAGTTGCTTACTTTAATATGGTCTACATG - 11160
-LCLFLLPSLATVAYFNMVYM
-CACFCYLLLQQLLTLIWSTC
VLVSVTFSCNSCLL*YGLHA
11161 - CCTGCTAGCTGGGTGATGCGTATCATGACATGGCTTGAATTGGCTGACACTAGCTTGTCT - 11220

-PASWVMR IMTWLELADT SLS 
-LLAG*CVS*HGLNWLTLACL 
C*LGDAYHDMA*IG*H*LVW 
11221 - GGTTATAGGCTTAAGGATTGTGTTATGTATGCTTCAGCTTTAGTTTTGCTTATTCTCATG - 11280 
-GYRLKDCVMYASALVLL I LM 
-VIGLRIVLCMLQL* F C L F S * 
L*A*GLCYVCFSFSFAYSHD 
11281 - ACAGCTCGCACTGTTTATGATGATGCTGCTAGACGTGTTTGGACACTGATGAATGTCATT - 11340 
-TARTVYD DAARRVWTLMNVI 
-QLALFMMMLLDVFGH* * M S L 
SSHCL**CC*TCLDTDECHY 
11341 - ACACTTGTTTACAAAGTCTACTATGGTAATGCTTTAGATCAAGCTATTTCCATGTGGGCC - 11400 

- T LVYKVYYGNAL DQAI S MWA 
-HLFTKSTMVML* IKLFPCGP 

TCLQSLLW*CFRSSYFHVGL 
11401 - TTAGTTATTTCTGTAACCTCTAACTATTCTGGTGTCGTTACGACTATCATGTTTTTAGCT - 11460
-LVISVTSNYSGVVTTIMFLA
-*LFL*PLTILVSLRLSCF*L

SYFCNL*LFWCRYDYHVFS* 
11461 - AGAGCTATAGTGTTTGTGTGTGTTGAGTATTACCCATTGTTATTTATTACTGGCAACACC - 11520 
-RAI VFVCVEYY PLLFI T GNT 
-EL*CLCVLS ITHCYLLLATP 
SYSVCVC*VLPIVIYYWQHL 
11521 - TTACAGTGTATCATGCTTGTTTATTGTTTCTTAGGCTATTGTTGCTGCTGCTACTTTGGC - 11580 
-LQCIMLVYCFLGYCCCCYFG 
-YSVSCLFIVS*AIVAAATLA
TVYHACLLFLRLLLLLLLWP 
11581 - CTTTTCTGTTTACTCAACCGTTACTTCAGGCTTACTCTTGGTGTTTATGACTACTTGGTC - 11640 
-LFCLLNRYFRLTLGVYDYLV 
-FSVYSTVTSGLLLVFMTTWS 
FLFTQPLLQAYSWCL* LLGL 
11641 - TCTACACAAGAATTTAGGTATATGAACTCCCAGGGGCTTTTGCCTCCTAAGAGTAGTATT - 11700 

- S TQEFRYMNSQGLLP PKS S I 
-LHKNLGI *TPRGFCLLRVVL 

YTRI*VYELPGAFAS*E*Y*
11701 - GATGCTTTCAAGCTTAACATTAAGTTGTTGGGTATTGGAGGTAAACCATGTATCAAGGTT - 11760 
-DAFKLNIKLLGIGGKPCIKV 
-MLSSLTLSCWVLEVNHVSRL 

CFQA*H*VVGYWR*TMYQGC 
11761 - GCTACTGTACAGTCTAAAATGTCTGACGTAAAGTGCACATCTGTGGTACTGCTCTCGGTT - 11820
-ATVQSKMSDVKCTSVVLLSV
-LLYSLKCLT*SAHLWYCSRF
YCTV*NV*RKVHICGTALGS 

11821 - CTTCAACAACTTAGAGTAGAGTCATCTTCTAAATTGTGGGCACAATGTGTACAACTCCAC - 11880 
-LQQLRVESSSKLWAQCVQLH 

- F N N L E * SHLLNCGHNVYNST 

STT*SRVIF* IVGTMCTTPQ 
11881 - AATGATATTCTTCTTGCAAAAGACACAACTGAAGCTTTCGAGAAGATGGTTTCTCTTTTG - 11940
-NDILLAKDTTEAFEKMVSLL
-MIFFLQKTQLKLSRRWFLFC
*YSSCKRHN*SFREDGFSFV
11941 - TCTGTTTTGCTATCCATGCAGGGTGCTGTAGACATTAATAGGTTGTGCGAGGAAATGCTC - 12000 
-SVLLSMQGAVDI NRLCEEML 
-LFCYPCRVL*TLIGCARKCS
CFAIHAGCCRH* *VVRGNAR 
12001 - GATAACCGTGCTACTCTTCAGGCTATTGCTTCAGAATTTAGTTCTTTACCATCATATGCC - 12060
-DNRATLQAIASEFSS LPSYA 
-ITVLLFRLLLQNLVLYHHMP 
*PCYSSGYCFRI*FFTIICR 
12061 - GCTTATGCCACTGCCCAGGAGGCCTATGAGCAGGCTGTAGCTAATGGTGATTCTGAAGTC - 12120 
-AYATAQEAYEQAVANGDS E V 
~LMPLPRRPMSRL*LMVILKS 
LCHCPGGL*AGCS*W*F*SR 
12121 - GTTCTCAAAAAGTTAAAGAAATCTTTGAATGTGGCTAAATCTGAGTTTGACCGTGATGCT - 12180 
-VLKKLKKSLNVAKSE FDRDA 

- FSKS*RNL*MWLNLSLTVML 

SQKVKEIFECG*I*V*P*CC 
12181 - GCCATGCAACGCAAGTTGGAAAAGATGGCAGATCAGGCTATGACCCAAATGTACAAACAG - 12240 
-AMQRKLEKMADQAMTQMYKQ 

- PCNASWKRWQIRL* PKCTNR 

HATQVGKDGRSGYDPNVQTG 
12241 - GCAAGATCTGAGGACAAGAGGGCAAAAGTAACTAGTGCTATGCAAACAATGCTCTTCACT - 12300 
-ARSEDKRAKVTSAMQTMLFT 
-QDLRTRGQK*LVLCKQCSSL 
KI *GQEGKSN* CYANNALHY 
12301 - ATGCTTAGGAAGCTTGATAATGATGCACTTAACAACATTATCAACAATGCGCGTGATGGT - 12360 
-MLRKLDNDALNNI INNARDG 
-CLGSLIMMHLTTLSTMRVMV 
A*EA***CT*QHYQQCA*WL 
12361 - TGTGTTCCACTCAACATCATACCATTGACTACAGCAGCCAAACTCATGGTTGTTGTCCCT - 12420
-CVPLNIIPLTTAAKLMVVVP 

- VFHSTSYH*LQQPNSWLLSL 

CSTQHHTIDYSSQTHGCCP* 
12421 - GATTATGGTACCTACAAGAACACTTGTGATGGTAACACCTTTACATATGCATCTGCACTC - 12480 
-DYGTYKNTCDGNT FT YASAL 

- IMVPTRTLVMVTPLHMHLHS 

LWYLQEHL*W*HLYIC ICTL 
12481 - TGGGAAATCCAGCAAGTTGTTGATGCGGATAGCAAGATTGTTCAACTTAGTGAAATTAAC - 12540 
-WE IQQVVDADSKI VQLSE IN 
GKSSKLLMRIARLFNLVKLT 
GNPASC*CG*QDCST**N*H 
12541 - ATGGACAATTCACCAAATTTGGCTTGGCCTCTTATTGTTACAGCTCTAAGAGCCAACTCA - 12600 

- M D N S PNLAWPLIVTALRANS 
-WTIHQIWLGLLLLQL*EPTQ 

GQFTKFGLASYCYSSKSQLS 
12601 - GCTGTTAAACTACAGAATAATGAACTGAGTCCAGTAGCACTACGACAGATGTCCTGTGCG - 12660
-AVKLQNNELSPVALRQMSCA
-LLNYRIMN*VQ*HYDRCPVR
C*TTE**TESSSTTTDVLCG
12661 - GCTGGTACCACACAAACAGCTTGTACTGATGACAATGCACTTGCCTACTATAACAATTCG - 12720 
-AGTTQTACTDDNALAYYNNS 
-LVPHKQLVLMTMHLPTITIR 
WYHTNSLY**QCTCLL*QFE 
12721 - AAGGGAGGTAGGTTTGTGCTGGCATTACTATCAGACCACCAAGATCTCAAATGGGCTAGA - 12780 
-KGGRFVLALLS DHQDL KWAR 
-REVGLCWHYYQTTKI SNGLD 
GR*VCAGITIRPPRSQMG*I 
12781 - TTCCCTAAGAGTGATGGTACAGGTACAATTTACACAGAACTGGAACCACCTTGTAGGTTT - 12840 

- F P K S DGTGT IYTELEP PCRF 

S LRVMVQVQFTQNWNHLVGL 
P*E*WYRYNLHRTGTTL*VC 
12841 - GTTACAGACACACCAAAAGGGCCTAAAGTGAAATACTTGTACTTCATCAAAGGCTTAAAC - 12900 
-VT DT PKGPKVKYLY FI KGLN 

- LQTHQKGLK*NTCTSSKA*T 

YRHTKRA* SEILVLHQRLKQ 
12901 - AACCTAAATAGAGGTATGGTGCTGGGCAGTTTAGCTGCTACAGTACGTCTTCAGGCTGGA - 12960 
-NLNRGMVLG SLAATVRLQAG 

- T * IEVWCWAV* LLQYVFRLE 

P K * RYGAGQFSCYSTSSGWK 
12961 - AATGCTACAGAAGTACCTGCCAATTCAACTGTGCTTTCCTTCTGTGCTTTTGCAGTAGAC - 13020 
-NATEVPANS TVLSFCAFAVD 
-MLQKYLPIQLCFPSVLLQ*T 
CYRSTCQFNCAFLLCFCSRP 
13021 - CCTGCTAAAGCATATAAGGATTACCTAGCAAGTGGAGGACAACCAATCACCAACTGTGTG - 13080 
-PAKAYKDYLASGGQPITNCV 
-LLKHIRIT*QVEDNQSPTV* 
C * S I *GLPSKWRTTNHQLCE 
13081 - AAGATGTTGTGTACACACACTGGTACAGGACAGGCAATTACTGTAACACCAGAAGCTAAC - 13140 
-KMLCTHTGTGQAI TVTPEAN 
RCCVHTLVQDRQLL* HQKLT 
DVVYTHWYRTGNYCNTRS*H 
13141 - ATGGACCAAGAGTCCTTTGGTGGTGCTTCATGTTGTCTGTATTGTAGATGCCACATTGAC - 13200 
-MDQESFGGASCCLYCRCHID 
-WTKSPLVVLHVVCIVDATLT 
GPRVLWWCFMLSVL*MPH*P 
13201 - CATCCAAATCCTAAAGGATTCTGTGACTTGAAAGGTAAGTACGTCCAAATACCTACCACT - 13260
-HPNPKGFCDLKGKYVQIPTT 

- IQILKDSVT*KVSTSKYLPL 

SKS*RIL*LER*VRPNTYHL 
13261 - TGTGCTAATGACCCAGTGGGTTTTACACTTAGAAACACAGTCTGTACCGTCTGCGGAATG - 13320
-CAN DPVGFTLRN TVCTVCGM 
-VLMTQWVLHLETQSVPSAEC 
C * *PSGFYT*KHSLYRLRNV 
13321 - TGGAAAGGTTATGGCTGTAGTTGTGACCAACTCCGCGAACCCTTGATGCAGTCTGCGGAT - 13380 
-WKGYGCSCDQLREPLMQSAD 
-GKVMAVVVTNSANP* CSLRM 
ERLWL*L* PTPRTLDAVCGC 
13381 - GCATCAACGTTTTTAAACGGGTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGTGCGGCA - 13440 
-ASTFLNGFAV*VQPVLHRAA 
-HQRF*TGLRCKCSPSYTVRH
INVFKRVCGVSAARLTPCGT 
13441 - CAGGCACTAGTACTGATGTCGTCTACAGGGCTTTTGATATTTACAACGAAAAAAGTGCTG - 13500
-QALVLMSSTGLLIFTTKKVL 
-RH*Y*CRLQGF*YLQRKKCW 
GTSTDVVYRAFDIYNEKSAG 
13501 - GTTTTGCAAAGTTCCTAAAAACTAATTGCTGTCGCTTCCAGGAGAAGGATGAGGAAGGCA - 13560 
-VLQSS*KLIAVASRRRMRKA 
-FCKVPKN*LLSLPGEG*GRQ 
FAKFLKTNCCRFQEKDEEGN 
13561 - ATTTATTAGACTCTTACTTTGTAGTTAAGAGGCATACTATGTCTAACTACCAACATGAAG - 13620 

- I Y * TLTL*LRG ILCLTTNMK 
-FIRLLLCS*EAYYV*LPT*R 

LLDSYFVVKRHTMSNYQHEE 
13621 - AGACTATTTATAACTTGGTTAAAGATTGTCCAGCGGTTGCTGTCCATGACTTTTTCAAGT - 13680 
-RLFI TWLKIVQRLLS MT FSS 
-DYL*LG*RLSSGCCP* LFQV 
TIYNLVKDCPAVAVHD FFKF 
13681 - TTAGAGTAGATGGTGACATGGTACCACATATATCACGTCAGCGTCTAACTAAATACACAA - 13740
-LE*MVTWYHIYHVSV* LNTQ 
-*SRW*HGTTYITSASN*IHN 
RVDGDMVPHISRQRLTKYTM 
13741 - TGGCTGATTTAGTCTATGCTCTACGTCATTTTGATGAGGGTAATTGTGATACATTAAAAG - 13800 

-WLI*SMLYVILMRVIVIH*K
-G*FSLCSTSF**G*L*YIKR

ADLVYALRHFDEGNCDTLKE 
13801 - AAATACTCGTCACATACAATTGCTGTGATGATGATTATTTCAATAAGAAGGATTGGTATG - 13860 
-KYS SHTIAVMMI I S IRR IGM 
-NTRHIQLL***LFQ*EGLV* 
ILVTYNCCDDDYFNKKDWYD 
13861 - ACTTCGTAGAGAATCCTGACATCTTACGCGTATATGCTAACTTAGGTGAGCGTGTACGCC - 13920 
-TS*RILTSYAYMLT*VSVYA 
-LRRES*HLTRIC*LR*ACTP 
FVENPDILRVYANLGERVRQ 
13921 - AATCATTATTAAAGACTGTACAATTCTGCGATGCTATGCGTGATGCAGGCATTGTAGGCG - 13980 
-NHY*RLYNSAMLCVMQAL*A 
-IIIKDCTILRCYA*CRHCRR 
SLLKTVQFCDAMRDAG IVGV 
13981 - TACTGACATTAGATAATCAGGATCTTAATGGGAACTGGTACGATTTCGGTGATTTCGTAC - 14040 
-Y*H*IIRILMGTGTISVISY 

- T D I R * SGS*WELVRFR* FRT 

LTLDNQDLNGNWYDFGDFVQ 
14041 - AAGTAGCACCAGGCTGCGGAGTTCCTATTGTGGATTCATATTACTCATTGCTGATGCCCA - 14100

- K * HQAAEFLLW IHI T HC * CP 
-SSTRLRSSYCGFILLIADAH 

VAPGCGVPIVDSYYSLLMPI 
14101 - TCCTCACTTTGACTAGGGCATTGGCTGCTGAGTCCCATATGGATGCTGATCTCGCAAAAC - 14160 
-SSL*LGHWLLSPIWMLISQN 
-PHFD*GIGC*VPYGC*SRKT 
LTLTRALAAESHMDADLAKP 
14161 - CACTTATTAAGTGGGATTTGCTGAAATATGATTTTACGGAAGAGAGACTTTGTCTCTTCG - 14220
-HLLSGI C * N M I LRKRDFVSS 
-TY*VGFAEI* FYGRETLSLR 
LIKWDLLKYDFTEERLCLFD 
14221 - ACCGTTATTTTAAATATTGGGACCAGACATACCATCCCAATTGTATTAACTGTTTGGATG - 14280
-TVILNIGTRHT IPIVLTVWM 
-PLF*ILGPDIPSQLY*LFG* 
RYFKYWDQTYHPNCINCLDD 
14281 - ATAGGTGTATCCTTCATTGTGCAAACTTTAATGTGTTATTTTCTACTGTGTTTCCACCTA - 14340
-IGVSFIVQTLMCYFLLCFHL 
-*VYPSLCKL*CVI FYCVSTY 
RCILHCANFNVLFSTVFPPT 

14341 - CAAGTTTTGGACCACTAGTAAGAAAAATATTTGTAGATGGTGTTCCTTTTGTTGTTTCAA - 14400

-QVLDH**EKYL*MVFLLLFQ
-KFWTTSKKNICRWCSFCCFN 

SFGPLVRKIFVDGVPFVVST 
14401 - CTGGATACCATTTTCGTGAGTTAGGAGTCGTACATAATCAGGATGTAAACTTACATAGCT - 14460
-LDTIFVS*ESYIIRM*TYIA
-WIPFS*VRSRT*SGCKLT*L 
GYHFRELGVVHNQDVNLHSS 
14461 - CGCGTCTCAGTTTCAAGGAACTTTTAGTGTATGCTGCTGATCCAGCTATGCATGCAGCTT - 14520 
-RVSVSRNF*CMLLIQLCMQL 
-ASQFQGTFSVCC*SSYACSF 
RLSFKELLVYAADPAMHAAS 
14521 - CTGGCAATTTATTGCTAGATAAACGCACTACATGCTTTTCAGTAGCTGCACTAACAAACA - 14580 
-LA I Y C * INALHAFQ*LH * Q T 
-WQFIAR* THYMLFSSCTNKQ 
GNLLLDKRTTCFSVAALTNN 
14581 - ATGTTGCTTTTCAAACTGTCAAACCCGGTAATTTTAATAAAGACTTTTATGACTTTGCTG - 14640 
-MLLFKLSNPVILIKTFMTLL 
-CCFSNCQTR*F**RLL*LCC
VAFQTVKPGNFNKDFYDFAV 
14641 - TGTCTAAAGGTTTCTTTAAGGAAGGAAGTTCTGTTGAACTAAAACACTTCTTCTTTGCTC - 14700 
-CLKVSLRKEVLLN*NTSSLL

- V* RFL*GRKFC* TKTLLLCS 

SKGFFKEGSSVELKHFFFAQ 
14701 - AGGATGGCAACGCTGCTATCAGTGATTATGACTATTATCGTTATAATCTGCCAACAATGT - 14760
-RMATLLSVIMTIIVIICQQC 

- GWQRCYQ*L*LLSL*SANNV 

DGNAAI SDYDYYRYNLPTMC 
14761 - GTGATATCAGACAACTCCTATTCGTAGTTGAAGTTGTTGATAAATACTTTGATTGTTACG - 14820 
-VI SDNSYS * LKLLINTL IVT 
-*YQTTPIRS*SC**IL*LLR 
DIRQLLFVVEVVDKYFDCYD 
14821 - ATGGTGGCTGTATTAATGCCAACCAAGTAATCGTTAACAATCTGGATAAATCAGCTGGTT - 14880 
-MVAVLMPTK*SLTIWINQLV
-WWLY*CQPSNR*QSG*ISWF 
GGCINANQVIVNNLDKSAGF 
14881 - TCCCATTTAATAAATGGGGTAAGGCTAGACTTTATTATGACTCAATGAGTTATGAGGATC - 14940 
-SHLINGVRLDFIMTQ* VMRI 
-PI**MG*G*TLL*LNEL*GS 
PFNKWGKARLYYDSMS YEDQ 
14941 - AAGATGCACTTTTCGCGTATACTAAGCGTAATGTCATCCCTACTATAACTCAAATGAATC - 15000 
-KMHFSRILSVMSSLL* L K * I 
-RCTFRVY*A*CHPYYNSNES 
DALFAYTKRNVIPTITQMNL 
15001 - TTAAGTATGCCATTAGTGCAAAGAATAGAGCTCGCACCGTAGCTGGTGTCTCTATCTGTA - 15060 
-LSMPLVQRIELAP*LVSLSV 
-*VCH*CKE*SSHRSWCLYL* 
KYAI SAKNRARTVAGVSICS 
15061 - GTACTATGACAAATAGACAGTTTCATCAGAAATTATTGAAGTCAATAGGCGCCACTAGAG - 15120 
-VL*QIDSFIRNY*SQ*PPLE 
-YYDK*TVSSEIIEVNSRH*R 
TMTNRQFHQKLLKS IAATRG 



FIG. 11 Con't



WO 2004/085633 



PCT/CN2004/000248 



37/90 

15121 - GAGCTACTGTGGTAATTGGAACAAGCAAGTTTTACGGTGGCTGGCATAATATGTTAAAAA - 15180 
-ELLW *LEQASFTVAGI IC *K 

-SYCGNWNKQVLRWLA*YVKN

ATVVIGTSKFYGGWHNMLKT 
15181 - CTGTTTACAGTGATGTAGAAACTCCACACCTTATGGGTTGGGATTATCCAAAATGTGACA - 15240
-LFTVM*KLHTLWVGI IQNVT 

- C L Q * CRNSTPYGLGLSKM*Q 

VYSDVETPHLMGWDYPKCDR 
15241 - GAGCCATGCCTAACATGCTTAGGATAATGGCCTCTCTTGTTCTTGCTCGCAAACATAACA - 15300 
-EPCLTCLG*WPLLFLLANIT 
-SHA*HA*DNGLSCSCSQT*H 
AMPNMLRIMASLVLARKHNT 
15301 - CTTGCTGTAACTTATCACACCGTTTCTACAGGTTAGCTAACGAGTGTGCGCAAGTATTAA - 15360 
-LAVTYHTVSTG*LTSVRKY* 
-LL*LITPFLQVS*RVCASIK 
CCNLSHRFYRLANECAQVLS 
15361 - GTGAGATGGTCATGTGTGGCGGCTCACTATATGTTAAACCAGGTGGAACATCATCCGGTG - 15420 
-VRWS CVAAHYMLNQVEHH PV 

- * DGHVWRLTIC*TRWNIIR* 

EMVMCGGSLYVKPGGTSSGD 
15421 - ATGCTACAACTGCTTATGCTAATAGTGTCTTTAACATTTGTCAAGCTGTTACAGCCAATG - 15480 
-MLQL LML I VSLT FVKLLQPM 
-CYNCLC**CL*HLSSCYSQC 
ATTAYAN SVFNI CQAVTANV 
15481 - TAAATGCACTTCTTTCAACTGATGGTAATAAGATAGCTGACAAGTATGTCCGCAATCTAC - 15540 
-*MHFFQLMVIR*LTSMSAIY
-KCTSFN*W**DS*QVCPQST 
NALLSTDGNKIADKYVRNLQ 
15541 - AACACAGGCTCTATGAGTGTCTCTATAGAAATAGGGATGTTGATCATGAATTCGTGGATG - 15600
-NTGSMSVS IEIGMLIMNSWM 
-TQAL*VSL*K*GC*S*IRG* 
HRLYECLYRNRDVDHEFVDE
15601 - AGTTTTACGCTTACCTGCGTAAACATTTCTCCATGATGATTCTTTCTGATGATGCCGTTG - 15660
-SFTLTCVNISP* * FFLMMPL 
-VLRLPA*TFLHDDSF**CRC 
FYAYLRKHFSMMILS DDAVV 
15661 - TGTGCTATAACAGTAACTATGCGGCTCAAGGTTTAGTAGCTAGCATTAAGAACTTTAAGG - 15720 
-CAITVTMRLKV* * LALRTLR 
-VL*Q*LCGSRFSS * H * E L * G 
CYNSNYAAQGLVAS I KNFKA 
15721 - CAGTTCTTTATTATCAAAATAATGTGTTCATGTCTGAGGCAAAATGTTGGACTGAGACTG - 15780

- Q F F I IKIMCSCLRQNVGLRL 
-SSLLSK*CVHV*GKMLD* D * 

VLYYQNNVFMSEAKCWTETD 
15781 - ACCTTACTAAAGGACCTCACGAATTTTGCTCACAGCATACAATGCTAGTTAAACAAGGAG - 15840
-TLLKDLTN FAHS I Q C * L N K E 

- PY*RTSRILLTAYNAS * T R R 

LTKGPHEFCSQHTMLVKQGD 
15841 - ATGATTACGTGTACCTGCCTTACCCAGATCCATCAAGAATATTAGGCGCAGGCTGTTTTG - 15900
-MITCTCLTQI HQEY*AQAVL 
-*LRVPALPRSIKNIRRRLFC 
DYVYLPYPDPSRILGAGCFV 
15901 - TCGATGATATTGTCAAAACAGATGGTACACTTATGATTGAAAGGTTCGTGTCACTGGCTA - 15960 
-SMILSKQMVHL* LKGSCHWL 
-R*YCQNRWYTYD*KVRVTGY 
DDIVKTDGTLMIERFVSLAI 
15961 - TTGATGCTTACCCACTTACAAAACATCCTAATCAGGAGTATGCTGATGTCTTTCACTTGT - 16020
-LMLTHLQNILIRSMLMSFTC 
-*CLPTYKTS*SGVC*CLSLV 
DAYPLTKHPNQEYADVFHLY 
16021 - ATTTACAATACATTAGAAAGTTACATGATGAGCTTACTGGCCACATGTTGGACATGTATT - 16080 
-IYNTLES YMMSLLATCWTCI 
-FTIH*KVT* * A Y W PHVGHVF 
LQYIRKLHDELTGHMLDMYS 
16081 - CCGTAATGCTAACTAATGATAACACCTCACGGTACTGGGAACCTGAGTTTTATGAGGCTA - 16140 

- P * C * LMI TPHGT GNLSFMRL 

- R N A N * * *HLTVLGT*VL* GY 

VMLTNDNTSRYWEPEFYEAM 
16141 - TGTACACACCACATACAGTCTTGCAGGCTGTAGGTGCTTGTGTATTGTGCAATTCACAGA - 16200 
-CTHHIQS CRL*VLVYCAIHR 
-VHTTYSLAGCRCLCIVQFTD 
YTPHTVLQAVGACVLCNSQT 
16201 - CTTCACTTCGTTGCGGTGCCTGTATTAGGAGACCATTCCTATGTTGCAAGTGCTGCTATG - 16260 
-LH FVAVPVLGDH SYVASAAM 

- FTSLRCLY* E T I PMLQVLL* 

SLRCGACIRRPFLCCKCCYD 
16261 - ACCATGTCATTTCAACATCACACAAATTAGTGTTGTCTGTTAATCCCTATGTTTGCAATG - 16320 
-TMSFQHHTN * C C L L I PMFAM 
-PCHFNITQISVVC*SLCLQC 
HVISTSHKLVLSVNPYVCNA 
16321 - CCCCAGGTTGTGATGTCACTGATGTGACACAACTGTATCTAGGAGGTATGAGCTATTATT - 16380 
-PQVVMSLM*HNCI*EV*AII 

-PRL*CH*CDTTVSRRYELLL

PGCDVTDVTQLYLGGMSYYC 
16381 - GCAAGTCACATAAGCCTCCCATTAGTTTTCCATTATGTGCTAATGGTCAGGTTTTTGGTT - 16440
-ASHI SLPLVFHYVLMVRFLV 
-QVT*ASH*FSIMC*WSGFWF 
KSHKPPISFPLCANGQVFGL 
16441 - TATACAAAAACACATGTGTAGGCAGTGACAATGTCACTGACTTCAATGCGATAGCAACAT - 16500 
-YTKTHV*AVTMSLTSMR*QH 
-IQKHMCRQ*QCH*LQCDSNM
YKNTCVGS DNVT DFNAIATC 
16501 - GTGATTGGACTAATGCTGGCGATTACATACTTGCCAACACTTGTACTGAGAGACTCAAGC - 16560 
-VI GLMLAITYLPTLVLRDSS 
-*LD*CWRLHTCQHLY*ETQA 
DWTNAGDYILANTCTERLKL 
16561 - TTTTCGCAGCAGAAACGCTCAAAGCCACTGAGGAAACATTTAAGCTGTCATATGGTATTG - 16620 
-FS QQKRS KPLRKHLS CHMVL 
-FRSRNAQSH*GNI*AVIWYC
FAAETLKATEETFKLS YGIA 
16621 - CCACTGTACGCGAAGTACTCTCTGACAGAGAATTGCATCTTTCATGGGAGGTTGGAAAAC - 16680 
-PLYAKYSLTENC I FHGRLEN 
-HCTRSTL*QRIASFMGGWKT 
TVREVLSDRELHLSWEVGKP 
16681 - CTAGACCACCATTGAACAGAAACTATGTCTTTACTGGTTACCGTGTAACTAAAAATAGTA - 16740 
-LDHH*TETMSLLVTV*LKIV 
* TTIEQKLCLYWLPCN*K** 
RPPLNRNYVFTGYRVTKNSK 
16741 - AAGTACAGATTGGAGAGTACACCTTTGAAAAAGGTGACTATGGTGATGCTGTTGTGTACA - 16800
-KYRLEST PLKKVTMVMLLCT 
-STDWRVHL*KR*LW*CCCVQ 
VQI GEYTFEKGDYGDAVVYR 
16801 - GAGGTACTACGACATACAAGTTGAATGTTGGTGATTACTTTGTGTTGACATCTCACACTG - 16860
-EVLRHTS * MLVI TLC*HLTL
-RYYDIQVECW*LLCVDISHC
GTTTYKLNVGDYFVLTSHTV
16861 - TAATGCCACTTAGTGCACCTACTCTAGTGCCACAAGAGCACTATGTGAGAATTACTGGCT - 16920
-*CHLVHLL*CHKSTM*ELLA
-NAT* CTYS SATRALCENYWL 
MPLSAPTLVPQEHYVRITGL 
16921 - TGTACCCAACACTCAACATCTCAGATGAGTTTTCTAGCAATGTTGCAAATTATCAAAAGG - 16980 
-CTQHSTSQMSFLAMLQ I IKR 
-VPNTQHLR*VF*QCCKLSKG 
YPTLNISDEFSSNVANYQKV 
16981 - TCGGCATGCAAAAGTACTCTACACTCCAAGGACCACCTGGTACTGGTAAGAGTCATTTTG - 17040 
-SACKSTLHSKDHLVLVRVIL 

- RHAKVLYTPRTTWYW*ESFC 

GMQKYSTLQGPPGTGKSHFA 
17041 - CCATCGGACTTGCTCTCTATTACCCATCTGCTCGCATAGTGTATACGGCATGCTCTCATG - 17100
-PSDLLS ITHLLA* CI RHALM 
-HRTCSLLPICSHSVYGMLSC 
I GLALYYPSARIVYTACSHA 
17101 - CAGCTGTTGATGCCCTATGTGAAAAGGCATTAAAATATTTGCCCATAGATAAATGTAGTA - 17160 
-QLLMPYVKRH*NICP*INVV 
-SC*CPM*KGIKIFAHR*M** 
AVDALCEKALKYLP I DKCSR 
17161 - GAATCATACCTGCGCGTGCGCGCGTAGAGTGTTTTGATAAATTCAAAGTGAATTCAACAC - 17220 
-E S YLRVRA* SVL INSK* IQH 
-NHTCACARRVF* * IQSEFNT 
I I PARARVECFDKFKVNSTL 
17221 - TAGAACAGTATGTTTTCTGCACTGTAAATGCATTGCCAGAAACAACTGCTGACATTGTAG - 17280 
-*NSMFSAL*MHCQKQLLTL* 
-RTVCFLHCKCIARNNC*HCS 
EQYVFCTVNALPETTADIVV 
17281 - TCTTTGATGAAATCTCTATGGCTACTAATTATGACTTGAGTGTTGTCAATGCTAGACTTC - 17340 
-SLMKSLWLLIMT *VLSMLDF 
-L* *NLYGY*L*LECCQC*TS 
FDE I SMATNYDLSVVNARLR 
17341 - GTGCAAAACACTACGTCTATATTGGCGATCCTGCTCAATTACCAGCCCCCCGCACATTGC - 17400
-VQNTTS ILAILLNYQ PPAHC 
-CKTLRLYWRSCSITSPPHIA 
AKHYVYIGDPAQLPAPRTLL 
17401 - TGACTAAAGGCACACTAGAACCAGAATATTTTAATTCAGTGTGCAGACTTATGAAAACAA - 17460

- * LKAH*NQNILIQCADL*KQ 

D * RHTRTRI F * FSVQTYENN 
TKGTLEPEYFNSVCRLMKTI 
17461 - TAGGTCCAGACATGTTCCTTGGAACTTGTCGCCGTTGTCCTGCTGAAATTGTTGACACTG - 17520 
-*VQTCSLELVAVVLLKLLTL
-RSRHVPWNLSPLSC*NC*HC 
GPDMFLGTCRRCPAEIVDTV 
17521 - TGAGTGCTTTAGTTTATGACAATAAGCTAAAAGCACACAAGGATAAGTCAGCTCAATGCT - 17580 
-*VL*FMTIS*KHTRISQLNA 
-ECFSL*Q*AKSTQG*VSSML 
SALVYDNKLKAHKDKSAQCF 
17581 - TCAAAATGTTCTACAAAGGTGTTATTACACATGATGTTTCATCTGCAATCAACAGACCTC - 17640 
-SKCSTKVLLHMMFHLQSTDL 
-QNVLQRCYYT*CFICNQQTS 
KMFYKGVITHDVSSAINRPQ 
17641 - AAATAGGCGTTGTAAGAGAATTTCTTACACGCAATCCTGCTTGGAGAAAAGCTGTTTTTA - 17700

- K * A L * EN FLHAI LLGEKLFL 

- NRRCKRISYTQSCLEKSCFY 

IGVVREFLTRNPAWRKAVFI 
17701 - TCTCACCTTATAATTCACAGAACGCTGTAGCTTCAAAAATCTTAGGATTGCCTACGCAGA - 17760 

- S H L I I H R T L * L Q K S * DCLRR 

- L T L * FTERCSFKNLRIAYAD 

SPYNSQNAVASKILGLPTQT 
17761 - CTGTTGATTCATCACAGGGTTCTGAATATGACTATGTCATATTCACACAAACTACTGAAA - 17820 

-LLIHHRVLNMTMSYSHKLLK

- C * F I T G F * I*LCHIHTNY*N 

VDSSQGSEYDYVIFTQTTET 
17821 - CAGCACACTCTTGTAATGTCAACCGCTTCAATGTGGCTATCACAAGGGCAAAAATTGGCA - 17880 
-QHTLVMS TASMWLSQGQKLA 

- STLL*CQPLQCGYHKGKNWH 

AHSCNVNRFNVAITRAKIGI 
17881 - TTTTGTGCATAATGTCTGATAGAGATCTTTATGACAAACTGCAATTTACAAGTCTAGAAA - 17940 
-FCA*CLIEIFMTNCNLQV*K

-FVHNV**RSL*QTAIYKSRN

LCIMS DRDLYDKLQFTSLEI 
17941 - TACCACGTCGCAATGTGGCTACATTACAAGCAGAAAATGTAACTGGACTTTTTAAGGACT - 18000
-YHVAMWLHYKQKM* LDFLRT 
-TTSQCGYITSRKCNWTF* GL 
PRRNVATLQAENVTGLFKDC 
18001 - GTAGTAAGATCATTACTGGTCTTCATCCTACACAGGCACCTACACACCTCAGCGTTGATA - 18060
-VVRSLLVFILHRHLHTSALI 
* * DHYWSSSYTGTYTPQR*Y 
SKI ITGLHPTQAPTHLSVDI 
18061 - TAAAATTCAAGACTGAAGGATTATGTGTTGACATACCAGGCATACCAAAGGACATGACCT - 18120
-*NSRLKDYVLTYQAYQRT*P 
-KIQD*RIMC*HTRHTKGHDL 
KFKTEGLCVDIPGIPKDMTY 
18121 - ACCGTAGACTCATCTCTATGATGGGTTTCAAAATGAATTACCAAGTCAATGGTTACCCTA - 18180
-TVDSSL*WVSK*ITKSMVTL
-P*THLYDGFQNELPSQWLP*
RRLISMMGFKMNYQVNGYPN 
18181 - ATATGTTTATCACCCGCGAAGAAGCTATTCGTCACGTTCGTGCGTGGATTGGCTTTGATG - 18240 
-ICLS PAKKLFVTFVRGLALM 
-YVYHPRRSYSSRSCVDWL*C 
MFITREEAIRHVRAWIGFDV 
18241 - TAGAGGGCTGTCATGCAACTAGAGATGCTGTGGGTACTAACCTACCTCTCCAGCTAGGAT - 18300 
-*RAVMQLEMLWVLTYLSS*D
-RGLSCN*RCCGY*PTSPARI 
EGCHATRDAVGTNLPLQLGF 
18301 - TTTCTACAGGTGTTAACTTAGTAGCTGTACCGACTGGTTATGTTGACACTGAAAATAACA - 18360
-FLQVLT * *LYRLVMLTLKIT 
-FYRC*LSSCTDWLC*H*K*H 
STGVNLVAVPTGYVDTENNT 
18361 - CAGAATTCACCAGAGTTAATGCAAAACCTCCACCAGGTGACCAGTTTAAACATCTTATAC - 18420

- Q N S PELMQNLHQVTSLN ILY 
-RIHQS*CKTSTR*PV*TSYT 

EFTRVNAKPPPGDQFKHLIP 
18421 - CACTCATGTATAAAGGCTTGCCCTGGAATGTAGTGCGTATTAAGATAGTACAAATGCTCA - 18480 

-HSCIKACPGM*CVLR*YKCS
-THV*RLALECSAY*DSTNAQ
LMYKGLPWNVVRIKIVQMLS
18481 - GTGATACACTGAAAGGATTGTCAGACAGAGTCGTGTTCGTCCTTTGGGCGCATGGCTTTG - 18540
-VIH*KDCQTESCSSFGRMAL
-*YTERIVRQSRVRPLGAWL*
DTLKGLS DRVVFVLWAHGFE 

18541 - AGCTTACATCAATGAAGTACTTTGTCAAGATTGGACCTGAAAGAACGTGTTGTCTGTGTG - 18600

- S L H Q * STLSRLDLKERVVCV 
-AYINEVLCQDWT*KNVLSV*

LTSMKYFVKIGPERTCCLCD 
18601 - ACAAACGTGCAACTTGCTTTTCTACTTCATCAGATACTTATGCCTGCTGGAATCATTCTG - 18660
-TNVQLAFLLHQILMPAG I IL 
-QTCNLLFYFIRYLCLLESFC 
KRATCFSTSSDTYACWNHSV 
18661 - TGGGTTTTGACTATGTCTATAACCCATTTATGATTGATGTTCAGCAGTGGGGCTTTACGG - 18720
-WVLTMS ITHL*LMFSSGALR 
-GF*LCL* PIYD*CSAVGLYG 
GFDYVYNPFMIDVQQWGFTG 
18721 - GTAACCTTCAGAGTAACCATGACCAACATTGCCAGGTACATGGAAATGCACATGTGGCTA - 18780 
-VTFRVTMTN IARYMEMHMWL 
-*PSE*P*PTLPGTWKCTCG* 
NLQSNHDQHCQVHGNAHVAS 
18781 - GTTGTGATGCTATCATGACTAGATGTTTAGCAGTCCATGAGTGCTTTGTTAAGCGCGTTG - 18840
-VVMLS*LDV*QSMSALLSAL
-L*CYHD*MFSSP*VLC*AR* 
CDAIMTRCLAVHECFVKRVD 
18841 - ATTGGTCTGTTGAATACCCTATTATAGGAGATGAACTGAGGGTTAATTCTGCTTGCAGAA - 18900
-IGLLNTLL*EMN*GLILLAE
-LVC*IPYYRR*TEG*FCLQK 
WSVEYPIIGDELRVNSACRK 
18901 - AAGTACAACACATGGTTGTGAAGTCTGCATTGCTTGCTGATAAGTTTCCAGTTCTTCATG - 18960
-KYNTWL* S LHCLLI S FQFFM 
-STTHGCEVCIAC**VSSSS* 
VQHMVVKSALLADKFPVLHD 
18961 - ACATTGGAAATCCAAAGGCTATCAAGTGTGTGCCTCAGGCTGAAGTAGAATGGAAGTTCT - 19020

- T L E IQRLSSVCLRLK*NGSS 
-HWKSKGYQVCASG* SRMEVL 

IGNPKAIKCVPQAEVEWKFY 
19021 - ACGATGCTCAGCCATGTAGTGACAAAGCTTACAAAATAGAGGAACTCTTCTATTCTTATG - 19080
-TMLSHVVTKLTK*RNSSILM
-RCSAM* *QSLQNRGTLLFLC 
DAQPCS DKAYKIEELFYSYA 
19081 - CTACACATCACGATAAATTCACTGATGGTGTTTGTTTGTTTTGGAATTGTAACGTTGATC - 19140 

- L H I TINSLMVFVCFGIVTLI 
-YTSR*IH*WCLFVLEL*R*S 

THHDKFTDGVCLFWNCNVDR 
19141 - GTTACCCAGCCAATGCAATTGTGTGTAGGTTTGACACAAGAGTCTTGTCAAACTTGAACT - 19200
-VTQPMQLCVGLTQESCQT*T 
-LPSQCNCV*V*HKSLVKLEL 
YPANAIVCRFDTRVLSNLNL 
19201 - TACCAGGCTGTGATGGTGGTAGTTTGTATGTGAATAAGCATGCATTCCACACTCCAGCTT - 19260 
-YQAVMVVVCM*ISMHSTLQL
-TRL*WW*FVCE*ACIPHSSF 
PGCDGGSLYVNKHAFHTPAF 
19261 - TCGATAAAAGTGCATTTACTAATTTAAAGCAATTGCCTTTCTTTTACTATTCTGATAGTC - 19320 

-SIKVHLLI*SNCLSFTILIV
-R*KCIY* FKAIAFLLLF* * S 

DKSAFTNLKQLPFFYYSDSP
19321 - CTTGTGAGTCTCATGGCAAACAAGTAGTGTCGGATATTGATTATGTTCCACTCAAATCTG - 19380
-LV S L MANK* CRILIM F H SNL 
-L*VSWQTSSVGY*LCSTQIC 
CESHGKQVVSDI DYVPLKSA 

19381 - CTACGTGTATTACACGATGCAATTTAGGTGGTGCTGTTTGCAGACACCATGCAAATGAGT - 19440 
-LRVLHDAI *VVLFADTMQMS 

-YVYYTMQFRWCCLQTPCK*V
TCITRCNLGGAVCRHHANEY
19441 - ACCGACAGTACTTGGATGCATATAATATGATGATTTCTGCTGGATTTAGCCTATGGATTT - 19500
-TDSTWMHII**FLLDLAYGF
-PTVLGCI*YDDFCWI*PMDL
RQYLDAYNMMISAGFSLWIY
19501 - ACAAACAATTTGATACTTATAACCTGTGGAATACATTTACCAGGTTACAGAGTTTAGAAA - 19560
-TNNLILITCGIHLPGYRV*K
-QTI*YL*PVEYIYQVTEFRK
KQFDTYNLWNTFTRLQSLEN
19561 - ATGTGGCTTATAATGTTGTTAATAAAGGACACTTTGATGGACACGCCGGCGAAGCACCTG - 19620
-MWLIMLLIKDTLMDTPAKHL
-CGL*CC**RTL*WTRRRSTC
VAYNVVNKGHFDGHAGEAPV
19621 - TTTCCATCATTAATAATGCTGTTTACACAAAGGTAGATGGTATTGATGTGGAGATCTTTG - 19680
-FPSLIMLFTQR*MVLMWRSL 

- F H H * *CCLHKGRWY*CGDL* 

SIINNAVYTKVDGIDVEIFE 
19681 - AAAATAAGACAACACTTCCTGTTAATGTTGCATTTGAGCTTTGGGCTAAGCGTAACATTA - 19740 

- K I RQH FLLMLHLSFGLS V T L 
-K*DNTSC*CCI*ALG*A*H* 

NKTTLPVNVAFELWAKRNIK 
19741 - AACCAGTGCCAGAGATTAAGATACTCAATAATTTGGGTGTTGATATCGCTGCTAATACTG - 19800 
-NQCQRLRYSIIWVLISLLIL 
-TSARD*DTQ*FGC*YRC*YC 
PVPEIKILNNLGVDIAANTV 
19801 - TAATCTGGGACTACAAAAGAGAAGCCCCAGCACATGTATCTACAATAGGTGTCTGCACAA - 19860

- * SGTTKEKPQHMYLQ*VSAQ 
-NLGLQKRSPSTCIYNRCLHN 

IWDYKREAPAHVSTIGVCTM 
19861 - TGACTGACATTGCCAAGAAACCTACTGAGAGTGCTTGTTCTTCACTTACTGTCTTGTTTG - 19920 

- * LTLPRNLLRVLVLHLLSCL 

D*HCQETY*ECLFFTYCLV* 
TDIAKKPTESACSSLTVLFD 
19921 - ATGGTAGAGTGGAAGGACAGGTAGACCTTTTTAGAAACGCCCGTAATGGTGTTTTAATAA - 19980 
-MVEWKDR* TFLETPVMVF* * 
-W*SGRTGRPF*KRP*WCFNN
GRVEGQV DLFRNARNGVLIT 
19981 - CAGAAGGTTCAGTCAAAGGTCTAACACCTTCAAAGGGACCAGCACAAGCTAGCGTCAATG - 20040 
-QKVQS K V * HLQRDQHKLASM 

- RRFSQRSNTFKGTSTS * R Q W 

EGSVKGLTPSKGPAQASVNG 
20041 - GAGTCACATTAATTGGAGAATCAGTAAAAACACAGTTTAACTACTTTAAGAAAGTAGACG - 20100 
-ESH*LENQ*KHSLTTLRK*T 

- SHINWRI SKNTV*LL* ESRR 

VTLIGESVKTQFNYFKKVDG 
20101 - GCATTATTCAACAGTTGCCTGAAACCTACTTTACTCAGAGCAGAGACTTAGAGGATTTTA - 20160 
-ALFNS CLKPTLLRAE T * R I L 
-HYSTVA*NLLYSEQRLRGF* 
IIQQLPETYFTQSRDLEDFK 
20161 - AGCCCAGATCACAAATGGAAACTGACTTTCTCGAGCTCGCTATGGATGAATTCATACAGC - 20220

- S P DHKWKLTFSS SLWMN SYS 
-AQITNGN*LSRARYG* IHTA 

PRSQMETDFLELAMDE FIQR 
20221 - GATATAAGCTCGAGGGCTATGCCTTCGAACACATCGTTTATGGAGATTTCAGTCATGGAC - 20280 
-DI SSRAMPSNTSFMEISVMD 
I *ARGLCLRTHRLWRFQSWT 
YKLEGYAFEHIVYGDFSHGQ 
20281 - AACTTGGCGGTCTTCATTTAATGATAGGCTTAGCCAAGCGCTCACAAGATTCACCACTTA - 20340 
-NLAVFI* **A*PSAHKIHHL 
-TWRSSFNDRLSQALTRFTT*
LGGLHLMIGLAKRSQDSPLK 
20341 - AATTAGAGGATTTTATCCCTATGGACAGCACAGTGAAAAATTACTTCATAACAGATGCGC - 20400
-N*RILSLWTAQ*KITS * QMR 

- IRGFYPYGQHSEKLLHNRCA 

LEDFI PMDSTVKNYFI TDAQ 
20401 - AAACAGGTTCATCAAAATGTGTGTGTTCTGTGATTGATCTTTTACTTGATGACTTTGTCG - 20460
-KQVHQNVCVL*LI FYLM TLS 

- NRF I KMCVFCD* S FT * * LCR 

TGSSKCVCSVIDLLLDDFVE 
20461 - AGATAATAAAGTCACAAGATTTGTCAGTGATTTCAAAAGTGGTCAAGGTTACAATTGACT - 20520

- R * * SHKICQ* FQKWSRLQLT 
-DNKVTRFVSDFKSGQGYN * L 

IIKSQDLSVISKVVKVTIDY 
20521 - ATGCTGAAATTTCATTCATGCTTTGGTGTAAGGATGGACATGTTGAAACCTTCTACCCAA - 20580 
-MLKFHSCFGVRMDMLKP STQ 
-C*NFIHALV*GWTC*NLLPK 
AEI SFMLWCKDGHVETFYPK 
20581 - AACTACAAGCAAGTCAAGCGTGGCAACCAGGTGTTGCGATGCCTAACTTGTACAAGATGC - 20640 
-NYKQVKRGNQVLRCLTC TRC 
-TTSKSSVATRCCDA*LVQDA
LQASQAWQPGVAM PNLYKMQ 
20641 - AAAGAATGCTTCTTGAAAAGTGTGACCTTCAGAATTATGGTGAAAATGCTGTTATACCAA - 20700 
-KECFLKSVTFRIMVKMLLYQ 
-KNAS*KV*PSELW*KCCYTK 
RMLLEKCDLQNYGENAVIPK 
20701 - AAGGAATAATGATGAATGTCGCAAAGTATACTCAACTGTGTCAATACTTAAATACACTTA - 20760
-KE***MSQSILNCVNT*IHL 
-RNNDECRKVYSTVS ILKYTY 
GIMMNVAKYTQLCQYLNTLT 
20761 - CTTTAGCTGTACCCTACAACATGAGAGTTATTCACTTTGGTGCTGGCTCTGATAAAGGAG - 20820 

- L * LYPTT*ELFTLVLALIKE 
-FSCTLQHESYSLWCWL**RS 

LAVPYNMRVIHFGAGSDKGV 
20821 - TTGCACCAGGTACAGCTGTGCTCAGACAATGGTTGCCAACTGGCACACTACTTGTCGATT - 20880 
-LHQVQLCSDNGCQLAHYLSI 
-CTRYSCAQTMVANWHTTCRF 
APGTAVLRQWLPTGTLLVDS 
20881 - CAGATCTTAATGACTTCGTCTCCGACGCAGATTCTACTTTAATTGGAGACTGTGCAACAG - 20940 
-QILMTSSPTQILL*LETVQQ
-RS**LRLRRRFYFNWRLCNS 
DLNDFVSDADSTLIGDCATV 
20941 - TACATACGGCTAATAAATGGGACCTTATTATTAGCGATATGTATGACCCTAGGACCAAAC - 21000 
-YIRLINGTLLLAICMTLGPN 
-TYG**MGPYY*RYV*P*DQT
HTANKWDLIISDMYDPRTKH 
21001 - ATGTGACAAAAGAGAATGACTCTAAAGAAGGGTTTTTCACTTATCTGTGTGGATTTATAA - 21060

-M*QKRMTLKKGFSLICVDL*

- CDKRE*L*RRVFHLSVWIYK 

VTKENDSKEGFFTYLCGFIK 
21061 - AGCAAAAACTAGCCCTGGGTGGTTCTATAGCTGTAAAGATAACAGAGCATTCTTGGAATG - 21120 
-SKN*PWVVL*L*R*QSILGM 
-AKTS PGWFYSCKDNRAFLEC 
QKLALGGSIAVKITEHSWNA 
21121 - CTGACCTTTACAAGCTTATGGGCCATTTCTCATGGTGGACAGCTTTTGTTACAAATGTAA - 21180 

- L T FT SLWAI SHGGQLLLQM* 

* PLQAYGPFLMVDSFCYKCK 
DLYKLMGHFSWWTAFVTNVN 
21181 - ATGCATCATCATCGGAAGCATTTTTAATTGGGGCTAACTATCTTGGCAAGCCGAAGGAAC - 21240 
-MHHHRKHF*LGLTILASRRN 
-CIIIGSIFNWG*LSWQAEGT 
A S S S EAFLI GANYL G K P KEQ 
21241 - AAATTGATGGCTATACCATGCATGCTAACTACATTTTCTGGAGGAACACAAATCCTATCC - 21300 
-KLMA I PCMLTTFSGGTQILS 
-N*WLYHAC*LHFLEEHKSYP 
I DGYTMHANYIFWRNTNPIQ 
21301 - AGTTGTCTTCCTATTCACTCTTTGACATGAGCAAATTTCCTCTTAAATTAAGAGGAACTG - 21360 
-SCLPIHSLT*ANFLLN*EEL

- VVFLFTL*HEQISS* IKRNC 

LSSYSLFDMSKFPLKLRGTA 
21361 - CTGTAATGTCTCTTAAGGAGAATCAAATCAATGATATGATTTATTCTCTTCTGGAAAAAG - 21420

- L * CLLRRIKSMI * FILFWKK 
-CNVS*GESNQ*YDLFSSGKR

VMSLKENQINDMIYSLLEKG 
21421 - GTAGGCTTATCATTAGAGAAAACAACAGAGTTGTGGTTTCAAGTGATATTCTTGTTAACA - 21480 
-VGLSLEKTTELW FQVI FLLT 

- *AYH*RKQQSCGFK*YSC*Q 

RLIIRENNRVVVSSDILVNN 
21481 - ACTAAACGAACATGTTTATTTTCTTATTATTTCTTACTCTCACTAGTGGTAGTGACCTTG - 21540 
-TKRTCLFSYYFLLSLVVVTL 
-LNEHVYFLIISYSH*W**P* 
*TNMFIFLLFLTLTSGSDLD 
21541 - ACCGGTGCACCACTTTTGATGATGTTCAAGCTCCTAATTACACTCAACATACTTCATCTA - 21600 
-T GAPLLMMFKLL I TLNI LHL 

- PVHHF**CSSS*LHSTYFIY 

RCTTFDDVQAPNYTQHTSSM 
21601 - TGAGGGGGGTTTACTATCCTGATGAAATTTTTAGATCAGACACTCTTTATTTAACTCAGG - 21660 

- * GGFTILMKFLDQTLFI * L R 
-EGGLLS**NF*IRHSLFNSG 

RGVYYPDEIFRSDTLYLTQD 
21661 - ATTTATTTCTTCCATTTTATTCTAATGTTACAGGGTTTCATACTATTAATCATACGTTTG - 21720 

- I YFFHFILMLQGFI LLI IRL 
-FISSILF*CYRVSYY*SYVW 

LFLPFYSNVTGFHTINHTFG 
21721 - GCAACCCTGTCATACCTTTTAAGGATGGTATTTATTTTGCTGCCACAGAGAAATCAAATG - 21780 
-A TLS YLLRMVFI LLPQRNQM 

- QPCHTF*GWYLFCCHREIKC 

N PVI PFKDGIYFAATEKSNV 
21781 - TTGTCCGTGGTTGGGTTTTTGGTTCTACCATGAACAACAAGTCACAGTCGGTGATTATTA - 21840 
-LSVVGFLVLP*TTSHSR*LL 
-CPWLGFWFYHEQQVTVGDYY 
VRGWVFGSTMNNKSQSVIII 
21841 - TTAACAATTCTACTAATGTTGTTATACGAGCATGTAACTTTGAATTGTGTGACAACCCTT - 21900
-LTILLMLLYEHVTLNCVTTL
-*QFY*CCYTSM*L*IV*QPF 
NNSTNVVIRACNFELCDNPF 
21901 - TCTTTGCTGTTTCTAAACCCATGGGTACACAGACACATACTATGATATTCGATAATGCAT - 21960
-SLLFLNPWVHRHIL* Y S IMH 
-LCCF*THGYTDTYYDIR* CI 
FAVSKPMGTQTHTMI F D N A F 
21961 - TTAATTGCACTTTCGAGTACATATCTGATGCCTTTTCGCTTGATGTTTCAGAAAAGTCAG - 22020
-LIALSSTYLMPFRLMFQKSQ 
-*LHFRVHI*CLFA*CFRKVR 
NCTFEYISDAFSLDVSEKSG 
22021 - GTAATTTTAAACACTTACGAGAGTTTGTGTTTAAAAATAAAGATGGGTTTCTCTATGTTT - 22080
-VILNTYESLCLKIKMGFSMF 

- *F*TLTRVCV*K*RWVSLCL 

NFKHLREFVFKNKDGFLYVY 
22081 - ATAAGGGCTATCAACCTATAGATGTAGTTCGTGATCTACCTTCTGGTTTTAACACTTTGA - 22140 
-I RAINL*M* FVI YLLVLTL* 
*GLSTYRCSS*STFWF*HFE 
KGYQPIDVVRDLPSGFNTLK 
22141 - AACCTATTTTTAAGTTGCCTCTTGGTATTAACATTACAAATTTTAGAGCCATTCTTACAG - 22200 
-NLFLSCLLVLTLQIL EP FLQ 
TYF* VASWY*HYKF* SHSYS 
PIFKLPLGINITNFR AILTA 
22201 - CCTTTTCACCTGCTCAAGACATTTGGGGCACGTCAGCTGCAGCCTATTTTGTTGGCTATT - 22260

- P FHLLKTFGARQ LQP I LLAI 
-LFTCSRHLGHVSCSLFCWLF 

FSPAQDIWGTSAAAY FVGYL 
22261 - TAAAGCCAACTACATTTATGCTCAAGTATGATGAAAATGGTACAATCACAGATGCTGTTG - 22320 

- * SQLHLCS SMMKMVQSQMLL 
-KANY IYAQV* *KWYNHRCC* 

KPTTFMLKYDENGTI TDAVD 
22321 - ATTGTTCTCAAAATCCACTTGCTGAACTCAAATGCTCTGTTAAGAGCTTTGAGATTGACA - 22380
-IVLKIHLLNSNALLRALRLT 
-LFSKSTC*TQMLC*EL*D*Q
CSQNPLAELKCSVKSFEIDK 
22381 - AAGGAATTTACCAGACCTCTAATTTCAGGGTTGTTCCCTCAGGAGATGTTGTGAGATTCC - 22440 
-KEFTRPLI SGLFPQEML * DS 
-RNLPDL* FQGCSLRRCCEIP 
GIYQTSNFRVVPSGDVVRFP 
22441 - CTAATATTACAAACTTGTGTCCTTTTGGAGAGGTTTTTAATGCTACTAAATTCCCTTCTG - 22500 
-LILQTCVLLERFLMLLNSLL 

- *YYKLVSFWRGF*CY* IPFC 

NITNLCPFGEVFNATKFPSV 
22501 - TCTATGCATGGGAGAGAAAAAAAATTTCTAATTGTGTTGCTGATTACTCTGTGCTCTACA - 22560 
-SMHGREKKFLIVLLI TLCST 
-LCMGEKKNF* LCC* LLCALQ 
YAWERKKI SNCVADYSVLYN 
22561 - ACTCAACATTTTTTTCAACCTTTAAGTGCTATGGCGTTTCTGCCACTAAGTTGAATGATC - 22620 
-TQH FFQPLSAMAFLPLS * M I 
-LNIFFNL*VLWRFCH*VE*S 
STFFSTFKCYGVSATKLNDL 
22621 - TTTGCTTCTCCAATGTCTATGCAGATTCTTTTGTAGTCAAGGGAGATGATGTAAGACAAA - 22680
-FAS PMSMQILL* SREMM*DK 
LLLQCLCRFFCS QGR* CKTN 
CFSNVYADSFVVKGDDVRQI 
22681 - TAGCGCCAGGACAAACTGGTGTTATTGCTGATTATAATTATAAATTGCCAGATGATTTCA - 22740

- * RQ DKLVLLLI I I INCQMI S 
-SARTNWCYC*L*L*IAR*FH

APGQTGVIADYNYKLPDDFM
22741 - TGGGTTGTGTCCTTGCTTGGAATACTAGGAACATTGATGCTACTTCAACTGGTAATTATA - 22800
-WVVSLLGILGTLMLLQLVII
-GLCPCLEY*EH*CYFNW*L*
GCVLAWNTRNIDATSTGNYN
22801 - ATTATAAATATAGGTATCTTAGACATGGCAAGCTTAGGCCCTTTGAGAGAGACATATCTA - 22860
-IINIGILDMASLGPLRETYL
-L*I*VS*TWQA*AL*ERHI*
YKYRYLRHGKLRPFERDISN
22861 - ATGTGCCTTTCTCCCCTGATGGCAAACCTTGCACCCCACCTGCTCTTAATTGTTATTGGC - 22920
-MCLSPLMANLAPHLLLIVIG
-CAFLP*WQTLHPTCS*LLLA
VPFSPDGKPCTPPALNCYWP
22921 - CATTAAATGATTATGGTTTTTACACCACTACTGGCATTGGCTACCAACCTTACAGAGTTG - 22980

- H * M IMVFT PLLALATNLTEL 

IK*LWFLHHYWHWLPTLQSC 
LNDYGFYTTTGIGYQPYRVV 
22981 - TAGTACTTTCTTTTGAACTTTTAAATGCACCGGCCACGGTTTGTGGACCAAAATTATCCA - 23040 
-*YFLLNF*MHRPRFVDQNYP 
-STFF*TFKCTGHGLWTKI IH 
VLS FELLNAPATVCGPKLST 
23041 - CTGACCTTATTAAGAACCAGTGTGTCAATTTTAATTTTAATGGACTCACTGGTACTGGTG - 23100 
-LTLLRTSVS I LILMDSLVLV 
-*PY*EPVCQF*F*WTHWYWC 
DLIKNQCVNFNFNGLTGTGV 
23101 - TGTTAACTCCTTCTTCAAAGAGATTTCAACCATTTCAACAATTTGGCCGTGATGTTTCTG - 23160

- C * LLLQRD FNHFNNLAVMFL 
-VNSFFKEISTISTIWP*CF* 

LTPSSKRFQPFQQFGRDVSD 
23161 - ATTTCACTGATTCCGTTCGAGATCCTAAAACATCTGAAATATTAGACATTTCACCTTGCT - 23220 
-I SL IPFEILKHLKY* TFHLA 
FH*FRSRS * N I *NIRHFTLL 
FTDSVRDPKTSEILDISPCS 
23221 - CTTTTGGGGGTGTAAGTGTAATTACACCTGGAACAAATGCTTCATCTGAAGTTGCTGTTC - 23280 
-LLGV*V*LHLEQMLHLKLLF
-FWGCKCNYTWNKCFI*SCCS 
FGGVSVITPGTNASSEVAVL 
23281 - TATATCAAGATGTTAACTGCACTGATGTTTCTACAGCAATTCATGCAGATCAACTCACAC - 23340 
-YIKMLTALMFLQQFMQINSH 
-ISRC*LH*CFYSNSCRSTHT 
YQDVNCTDVSTAIHADQLTP 
23341 - CAGCTTGGCGCATATATTCTACTGGAAACAATGTATTCCAGACTCAAGCAGGCTGTCTTA - 23400 
-QLGAY I LLETMYSRLKQAVL 
SLAHI FYWKQCI PDSSRLSY 
AWRIYSTGNNVFQTQAGCLI 
23401 - TAGGAGCTGAGCATGTCGACACTTCTTATGAGTGCGACATTCCTATTGGAGCTGGCATTT - 23460

- * E L S M S TLLMSATFLLELAF 

-RS*ACRHFL*VRHSYWSWHL

GAEHVDTSYECDIPIGAGIC 
23461 - GTGCTAGTTACCATACAGTTTCTTTATTACGTAGTACTAGCCAAAAATCTATTGTGGCTT - 23520 
-VLVTIQFLYYVVLAKNLLWL 
-C*LPYSFFIT*Y*PKIYCGL 
ASYHTVSLLRSTSQKSIVAY 
23521 - ATACTATGTCTTTAGGTGCTGATAGTTCAATTGCTTACTCTAATAACACCATTGCTATAC - 23580
-ILCL*VLIVQLLTLITPLLY
-YYVFRC**FNCLL**HHCYT 
TMSLGADSSIAYSNNTIAIP 

23581 - CTACTAACTTTTCAATTAGCATTACTACAGAAGTAATGCCTGTTTCTATGGCTAAAACCT - 23640 
-LLTFQLALLQK* CLFL WLKP 

- Y * LFN*HYYRSNACFYG*NL 

TNFSISITTEVMPVSMAKTS 
23641 - CCGTAGATTGTAATATGTACATCTGCGGAGATTCTACTGAATGTGCTAATTTGCTTCTCC - 23700 

- P * IVICTSAEILLNVLICFS 
-RRL*YVHLRRFY*MC* FASP 

VDCNMYICGDSTECANLLLQ 
23701 - AATATGGTAGCTTTTGCACACAACTAAATCGTGCACTCTCAGGTATTGCTGCTGAACAGG - 23760 
-NMVAFAHN* IVHSQVLLLNR 

- IW*LLHTTKSCTLRYCC* TG 

YGSFCTQLNRALSGIAAEQD 
23761 - ATCGCAACACACGTGAAGTGTTCGCTCAAGTCAAACAAATGTACAAAACCCCAACTTTGA - 23820 
-I ATHVKCS LKSNKCT K P Q L * 

- S Q H T * SVRSSQTNVQNPNFE 

RNTREVFAQVKQMYKT PTLK 
23821 - AATATTTTGGTGGTTTTAATTTTTCACAAATATTACCTGACCCTCTAAAGCCAACTAAGA - 23880 
-NILVVLI FHKYYLTL * SQLR 
-IFWWF*FFTNIT*PSKAN*E 
YFGGFNFSQI LPD PLKPTKR 
23881 - GGTCTTTTATTGAGGACTTGCTCTTTAATAAGGTGACACTCGCTGATGCTGGCTTCATGA - 23940 
-GLLLRTCS LI R * HSLMLAS* 
-VFY*GLAL**GDTR*CWLHE 
SFIEDLLFNKVTLADAGFMK 
23941 - AGCAATATGGCGAATGCCTAGGTGATATTAATGCTAGAGATCTCATTTGTGCGCAGAAGT - 24000 
-SNMANA*V ILMLE IS FVRRS 
-AIWRMPR*Y*C*RSHLCAEV 
QYGECLGDINARDLICAQKF 
24001 - TCAATGGACTTACAGTGTTGCCACCTCTGCTCACTGATGATATGATTGCTGCCTACACTG - 24060
-SMDLQCCHLCSLM I* LLPTL 

- QWTYSVATSAH* *YDCCLHC 

NGLTVLPPLLTDDMIAAYTA 
24061 - CTGCTCTAGTTAGTGGTACTGCCACTGCTGGATGGACATTTGGTGCTGGCGCTGCTCTTC - 24120 

- L L * LVVLP LLDGHLVLALLF 

- CSS*WYCHCWMDIWCWRCSS 

ALVSGTATAGWTFGAGAALQ 
24121 - AAATACCTTTTGCTATGCAAATGGCATATAGGTTCAATGGCATTGGAGTTACCCAAAATG - 24180
-KYLLLCKWHIGSMALELPKM 
-NTFCYANGI *VQWHWSYPKC 
I PFAMQMAYRFNGIGVTQNV 
24181 - TTCTCTATGAGAACCAAAAACAAATCGCCAACCAATTTAACAAGGCGATTAGTCAAATTC - 24240
-FSMRTKNKS PTNLTRRLVKF 
-SL*EPKTNRQPI *QGD*SNS 
LYENQKQIANQFNKAI SQIQ 
24241 - AAGAATCACTTACAACAACATCAACTGCATTGGGCAAGCTGCAAGACGTTGTTAACCAGA - 24300 
-KNHLQQHQLHWASCKTLLTR 
-RITYNNINCIGQAARRC*PE 
ESLTTTSTALGKLQDVVNQN 
24301 - ATGCTCAAGCATTAAACACACTTGTTAAACAACTTAGCTCTAATTTTGGTGCAATTTCAA - 24360 
-MLKH * THLLNNLALI LVQFQ 
-CSSIKHTC*TT*L*FWCNFK 
AQALNTLVKQLSSNFGAISS 
24361 - GTGTGCTAAATGATATCCTTTCGCGACTTGATAAAGTCGAGGCGGAGGTACAAATTGACA - 24420
-VC *MI SFRDL IKSRRRYKLT 

- C A K * Y P F A T * * SRGGGTN*Q 

VLNDILSRLDKVEAEVQIDR 
24421 - GGTTAATTACAGGCAGACTTCAAAGCCTTCAAACCTATGTAACACAACAACTAATCAGGG - 24480

- G * LQADFKAFKPM* HNN * S G 
-VNYRQTSKPSNLCNTTTNQG 

LITGRLQSLQTYVTQQLIRA 
24481 - CTGCTGAAATCAGGGCTTCTGCTAATCTTGCTGCTACTAAAATGTCTGAGTGTGTTCTTG - 24540
-LLKSGLLLILLLLKCLSVFL 
-C*NQGFC*SCCY*NV*VCSW 
AEI RASANLAATKMSECVLG 
24541 - GACAATCAAAAAGAGTTGACTTTTGTGGAAAGGGCTACCACCTTATGTCCTTCCCACAAG - 24600 
-DNQKELT FVERATTLCP SHK 
-TIKKS*LLWKGLPPYVLPTS 
QSKRVDFCGKGYHLMS FPQA 
24601 - CAGCCCCGCATGGTGTTGTCTTCCTACATGTCACGTATGTGCCATCCCAGGAGAGGAACT - 24660
-QPRMVLSSYMSRMCHPRRGT
-SPAWCCLPTCHVCAIPGEEL 
APHGVVFLHVTYVPSQERNF 
24661 - TCACCACAGCGCCAGCAATTTGTCATGAAGGCAAAGCATACTTCCCTCGTGAAGGTGTTT - 24720 

- S PQRQQFVMKAKHT S LVKVF 
-HHSASNLS*RQSILPS*RCF

TTAPAICHEGKAYFPREGVF 
24721 - TTGTGTTTAATGGCACTTCTTGGTTTATTACACAGAGGAACTTCTTTTCTCCACAAATAA - 24780
-LCLMALLGLLHRGTS F L H K * 
-CV*WHFLVYYTEELLFSTNN 
VFNGTSWFI TQRNFFS PQII 
24781 - TTACTACAGACAATACATTTGTCTCAGGAAATTGTGATGTCGTTATTGGCATCATTAACA - 24840 
-LLQTIHLSQEIVMSLLASLT 

- YYRQYICLRKL* CRYWHH*Q 

TTDNTFVSGNCDVVIG I I N N 
24841 - ACACAGTTTATGATCCTCTGCAACCTGAGCTTGACTCATTCAAAGAAGAGCTGGACAAGT - 24900

-TQFMILCNLSLTHSKKSWTS

- H S L * SSAT*A*LIQRRAGQV 

TVYDPLQPELDSFKEELDKY 
24901 - ACTTCAAAAATCATACATCACCAGATGTTGATCTTGGCGACATTTCAGGGATTAACGCTT - 24960 
-TSKI IHHQMLILATFQALTL 
-LQKSYITRC* SWRHFRH*RF 
F KNHTSP DVDLGDI SG INAS 
24961 - CTGTCGTCAACATTCAAAAAGAAATTGACCGCCTCAATGAGGTCGCTAAAAATTTAAATG - 25020
-LSST FKKKLTASMRSLKI * M 
-CRQHSKRN*PPQ*GR*KFK* 
VVNI QKE IDRLNEVAKNLNE 
25021 - AATCACTCATTGACCTTCAAGAATTGGGAAAATATGAGCAATATATTAAATGGCCTTGGT - 25080 
-NHSLTFKNWENMSN ILNGLG 
-ITH*PSRIGKI*AIY*MALV 
SLIDLQELGKYEQYIKWPWY 
25081 - ATGTTTGGCTCGGCTTCATTGCTGGACTAATTGCCATCGTCATGGTTACAATCTTGCTTT - 25140 
-MFGSASLLD*LPSSWLQSCF
-CLARLHCWTNCHRHGYNLAL 
VWLGFIAGLIAIVMVTILLC 
25141 - GTTGCATGACTAGTTGTTGCAGTTGCCTCAAGGGTGCATGCTCTTGTGGTTCTTGCTGCA - 25200 
~ V A * LVVAVASRVHALVVLAA 
-LHD*LLQLPQGCMLLWFLLQ 
CMTSCCSCLKGACSCGSCCK 



FIG. 11 Con't 




25201 - AGTTTGATGAGGATGACTCTGAGCCAGTTCTCAAGGGTGTCAAATTACATTACACATAAA - 25260 
-SLMRMTLSQFSRVSNYI THK 
-V**G*L*ASSQGCQITLHIN 
FDEDDSEPVLKGVKLHYT*T 

25261 - CGAAGTTATGGATTTGTTTATGAGATTTTTTACTCTTGGATCAATTACTGCACAGCCAGT - 25320 
-RTYGFVYEI FYSWINYCTAS 

- ELMDLFMRFFTLGS I TAQPV 

NLWICL*DFLLLDQLLHSQ* 
25321 - AAAAATTGACAATGCTTCTCCTGCAAGTACTGTTCATGCTACAGCAACGATACCGCTACA - 25380 

- K N * QCFSCKYCS CYSNDTAT 

- K I DNASPASTVHATATI PLQ 

KLTMLLLQVLFMLQQRYRYK 
25381 - AGCCTCACTCCCTTTCGGATGGCTTGTTATTGGCGTTGCATTTCTTGCTGTTTTTCAGAG - 25440
-SLTPFRMACYWRC ISCCFSE 
-ASLPFGWLVIGVAFLAVFQS 
PHSLSDGLLLALHFLLFFRA 
25441 - CGCTACCAAAATAATTGCGCTCAATAAAAGATGGCAGCTAGCCCTTTATAAGGGCTTCCA - 25500 
-RYQNNCAQ* KMAASP L * GLP 

- A T K I IALNKRWQLALYKGFQ 

LPK*LRSIKDGS*PFIRASS 
25501 - GTTCATTTGCAATTTACTGCTGCTATTTGTTACCATCTATTCACATCTTTTGCTTGTCGC - 25560 
-VHLQFTAAI CYHLFT S F A C R 
-FICNLLLLFVTIYSHLLLVA 
SFAIYCCYLLPSIHIFCLSL 
25561 - TGCAGGTAAGGAGGCGCAATTTTTGTACCTCTATGCCTTGATATATTTTCTACAATGCAT - 25620

- C R * GGAIFVPLCLDI F S TMH 
-AGKEAQFLYLYALIYFLQCI 

QVRRRNFCTSMP* YI FYNAS 
25621 - CAACGCATGTAGAATTATTATGAGATGTTGGCTTTGTTGGAAGTGCAAATCCAAGAACCC - 25680 
-QRM * NYYEMLALLEVQI QEP 
-NACRIIMRCWLCWKCKSKNP 
T H V E L L * DVGFVGSANPRTH 
25681 - ATTACTTTATGATGCCAACTACTTTGTTTGCTGGCACACACATAACTATGACTACTGTAT - 25740
-I T L * CQLLCLLAHT* L * L L Y 
-LLYDANYFVCWHTHNYDYCI 
YFMMPTTLFAGTHITMTTVY 
25741 - ACCATATAACAGTGTCACAGATACAATTGTCGTTACTGAAGGTGACGGCATTTCAACACC - 25800 

- T I *QCHRYNCRY*R*RHFNT 

PYNSVTDTIVVTEGDGI STP 
HITVSQIQLSLLKVTAFQHQ 
25801 - AAAACTCAAAGAAGACTACCAAATTGGTGGTTATTCTGAGGATAGGCACTCAGGTGTTAA - 25860 
~KTQRRLPNWWLF*G*ALRC* 
-KLKEDYQIGGYSEDRHS GVK 
NSKKTTKLVVILRIGTQVLK 
25861 - AGACTATGTCGTTGTACATGGCTATTTCACCGAAGTTTACTACCAGCTTGAGTCTACACA - 25920
-RLCRCTWLFHRSLLPA*VYT 

- DYVVVHGYFTEVYYQLE STQ 

TMSLYMAISPKFTTSLSLHK 
25921 - AATTACTACAGACACTGGTATTGAAAATGCTACATTCTTCATCTTTAACAAGCTTGTTAA - 25980
-NYYRHWY*KCYILHL*QAC* 
ITTDTGIENATFFIFNKLVK 
LLQTLVLKMLHS SSLTSLLK 
25981 - AGACCCACCGAATGTGCAAATACACACAATCGACGGCTCTTCAGGAGTTGCTAATCCAGC - 26040 
-RPTECANTHNRRLFRSC^SS 
-DPPNVQIHTIDGSSGVANPA 
THRMCKYTQSTALQELLIQQ 
26041 - AATGGATCCAATTTATGATGAGCCGACGACGACTACTAGCGTGCCTTTGTAAGCACAAGA - 26100
-NGSNL* *ADDDY*RAFVSTR 
-MDPIYDEPTTTTSVPL*AQE 
WIQFMMSRRRLLACLCKHKK 

26101 - AAGTGAGTACGAACTTATGTACTCATTCGTTTCGGAAGAAACAGGTACGTTAATAGTTAA - 26160

- K * VRTYVLIRFGRNRYVNS * 
-SEYELMYSFVSEETGTLIVN 

VSTNLCTHSFRKKQVR* * L I 
26161 - TAGCGTACTTCTTTTTCTTGCTTTCGTGGTATTCTTGCTAGTCACACTAGCCATCCTTAC - 26220
-*RTSFSCFRGILASHTSHPY 
-SVLLFLAFVVFLLVTLAILT 
AYFFFLLSWYSC*SH*PSLL 
26221 - TGCGCTTCGATTGTGTGCGTACTGCTGCAATATTGTTAACGTGAGTTTAGTAAAACCAAC - 26280
-CAS IVCVLLQYC*REFSKTN 
-ALRLCAYCCNIVNVSLVKPT 
RFDCVRTAAILLT*V**NQR 
26281 - GGTTTACGTCTACTCGCGTGTTAAAAATCTGAACTCTTCTGAAGGAGTTCCTGATCTTCT - 26340 
-GLRLLAC*KSELF*RSS*SS 

- VYVYSRVKNLNSSEGVPDLL 

FTSTRVLKI*TLLKEFLIFW 
26341 - GGTCTAAACGAACTAACTATTATTATTATTCTGTTTGGAACTTTAACATTGCTTATCATG - 26400 
-GLNELTI I I I LFGTLTLLIM 
-V* TN*LLLLFCLEL*HCLSW 
SKRTNYYYYSVWNFNIAYHG 
26401 - GCAGACAACGGTACTATTACCGTTGAGGAGCTTAAACAACTCCTGGAACAATGGAACCTA - 26460
-ADNGT I TVEELKQLLEQWNL 
-QTTVLLPLRSLNNSWNNGT* 
RQRYYYR*GA*TTPGTMEPS 
26461 - GTAATAGGTTTCCTATTCCTAGCCTGGATTATGTTACTACAATTTGCCTATTCTAATCGG - 26520 
-VI GFLFLAWIMLLQFAY SNR 
-**VSYS*PGLCYYNLPILIG 
NRFPIPSLDYVTTICLF* SE 
26521 - AACAGGTTTTTGTACATAATAAAGCTTGTTTTCCTCTGGCTCTTGTGGCCAGTAACACTT - 26580 
-NRFLYI IKLVFLWLLWPVTL 
-TGFCT** SLFSSGSCGQ*HL 
QVFVHNKACFPLALVASNTC 
26581 - GCTTGTTTTGTGCTTGCTGTTGTCTACAGAATTAATTGGGTGACTGGCGGGATTGCGATT - 26640 
-ACFVLAVVYRINWVTGG IAI 
-LVLCLLLSTELIG*LAGLRL
LFCACCCLQN*LGDWRDCDC 
26641 - GCAATGGCTTGTATTGTAGGCTTGATGTGGCTTAGCTACTTCGTTGCTTCCTTCAGGCTG - 26700
-AMAC IVGLMWLSYFVAS FRL 
-QWLVL*A*CGLATSLLPSGC
NGLYCRLDVA*LLRCFLQAV 
26701 - TTTGCTCGTACCCGCTCAATGTGGTCATTCAACCCAGAAACAAACATTCTTCTCAATGTG - 26760 
-FARTRSMWSFNPETNILLNV 
LLVPAQCGHSTQKQTFFSMC 
CSYPLNVVIQPRNKHSSQCA 
26761 - CCTCTCCGGGGGACAATTGTGACCAGACCGCTCATGGAAAGTGAACTTGTCATTGGTGCT - 26820
-PL R G T I VTRP LMES E LV I GA 
-LSGGQL* PDRSWKVNLSLVL 
SPGDNCDQTAHGK*TCHWCC 
26821 - GTGATCATTCGTGGTCACTTGCGAATGGCCGGACACTCCCTAGGGCGCTGTGACATTAAG - 26880
-VIIRGHLRMAGHSLGRCDIK 

- + SFVVTCEWPDTP*GAVTLR 

DHSWSLANGRTLPRAL*H*G 
26881 - GACCTGCCAAAAGAGATCACTGTGGCTACATCACGAACGCTTTCTTATTACAAATTAGGA - 26940
-DLPKE ITVATSRTLSYYKLG 
-TCQKRSLWLHHERFLITN*E 
PAKRDHCGYITNAFLLQIRS 
26941 - GCGTCGCAGCGTGTAGGCACTGATTCAGGTTTTGCTGCATACAACCGCTACCGTATTGGA - 27000 
-ASQRVGT D SGFAAYNRYRI G 
-RRSV*ALIQVLLHTTATVLE 
V A A C R H * FRFCCIQPLPYWK 
27001 - AACTATAAATTAAATACAGACCACGCCGGTAGCAACGACAATATTGCTTTGCTAGTACAG - 27060
-NYKLNT DHAGSNDN IALLVQ 
-TIN* IQTTPVATTILLC*YS 
L*IKYRPRR*QRQYCFASTV 
27061 - TAAGTGACAACAGATGTTTCATCTTGTTGACTTCCAGGTTACAATAGCAGAGATATTGAT - 27120 
-*VTTDVSSC*LPGYNSRDID 

- K * QQMFHLVDFQVTIAEILI 

SDNRCFILLTSRLQ*QRY*L 
27121 - TATCATTATGAGGACTTTCAGGATTGCTATTTGGAATCTTGACGTTATAATAAGTTCAAT - 27180 
-YHYEDFQDCYLES * R Y N K F N 
-IIMRTFRIAIWNLDVI ISSI 
SL*GLSGLI)FGILTL* * V Q * 
27181 - AGTGAGACAATTATTTAAGCCTCTAACTAAGAAGAATTATTCGGAGTTAGATGATGAAGA - 27240 
-SETII*ASN*EELFGVR**R 
-VRQLFKPLTKKNYSELDDEE 
* DNYLSL* LRRIIRS*MMKN 
27241 - ACCTATGGAGTTAGATTATCCATAAAACGAACATGAAAATTATTCTCTTCCTGACATTGA - 27300 
-TYGVRLS I KRT * K k F S S * H * 
-PMELDYP^NEHENYSLPDID 
LWS* IIHKTNMKI ILFLTLI 
27301 - TTGTATTTACATCTTGCGAGCTATATCACTATCAGGAGTGTGTTAGAGGTACGACTGTAC - 27360 
-LYLHLASY ITI RSVLEVRLY 
~CIYILRAISLSGVC*RYDCT 
VFTSCELYHYQECVRGTTVL 
27361 - TACTAAAAGAACCTTGCCCATCAGGAACATACGAGGGCAATTCACCATTTCACCCTCTTG - 27420 

- Y *KNLAHQEHTRAIHHFTLL 

- TKRTLPIRNIRGQFTI S PSC 

LKEPCPSGTYEGNSPFHPLA 
27421 - CTGACAATAAATTTGCAGTAACTTGCACTAGCACACACTTTGCTTTTGCTTGTGCTGACG - 27480 
-LTINLH *LALAHTLLLLVLT 
■-*Q*ICTNLH*HTLCFCLC*R 
DNKFALTCTSTHFAFACADG 
27481 - GTACTCGACATACCTATCAGCTGCGTGCAAGATCAGTTTCACCAAAACTTTTCATCAGAC - 27540 
-VLDI PI SCVQDQFHQNFSSD 
-YSTYLSAACKISFTKTFHQT 
TRHTYQLRARSVSPKLFIRQ 
27541 - AAGAGGAGGTTCAACAAGAGCTCTACTCGCCACTTTTTCTCATTGTTGCTGCTCTAGTAT - 27600 
-KRRFNKSSTRHFFSLLLL*Y 
-RGGSTRALLATFSHCCCSSI 
EEVQQELYS PLFLIVAALVF 
27601 - TTTTAATACTTTGCTTCACCATTAAGAGAAAGACAGAATGAATGAGCTCACTTTAATTGA - 27660 

- F * Y F A S PLRERQNE*AHFN* 
-FNTLLHH* EKDRMNELTLID 

LILCFTIKRKTE*MSSL*LT 
27661 - CTTCTATTTGTGCTTTTTAGCCTTTCTGCTATTCCTTGTTTTAATAATGCTTATTATATT - 27720 
-LLFVLFSLSAI PCFNNAYYI 
-FYLCFLAFLLFLVLIMLI IF 

SICAF*PFCYSLF**CLLYF 
27721 - TTGGTTTTCACTCGAAATCCAGGATCTAGAAGAACCTTGTACCAAAGTCTAAACGAACAT - 27780

- L V FTRNP G SRRTLYQS LNEH 
-WFSLEIQDLEEPCTKV*TNM

GFHSKSRI*KNLVPKSKRT*
27781 - GAAACTTCTCATTGTTTTGACTTGTATTTCTCTATGCAGTTGCATATGCACTGTAGTACA - 27840 
-ETSHCFDLYFSMQLHMHCST 
-KLLIVLTCISLCSCICTVVQ 
NFSLF*LVFLYAVAYAL*YS 
27841 - GCGCTGTGCATCTAATAAACCTCATGTGCTTGAAGATCCTTGTAAGGTACAACACTAGGG - 27900

- A L C I * *TSCA*RSL*GTTLG 
-RCASNKPHVLEDPCKVQH*G 

AVHL INLMCLKI LVRYNTRG 
27901 - GTAATACTTATAGCACTGCTTGGCTTTGTGCTCTAGGAAAGGTTTTACCTTTTCATAGAT - 27960
-VI LIALLGFVL* ERFY LFI D 
* Y L * HCLALCSRKGFTFS * M 
NTYSTAWLCALGKVLPFHRW 

27961 - GGCACACTATGGTTCAAACATGCACAGCTAATGTTACTATCAACTGTCAAGATCCAGCTG - 28020

-GTLWFKHAHLMLLSTVKIQL 
-AHYGSNMHT*CYYQLSRSSW 
HTMVQTCTPNVTINCQDPAG 

28021 - GTGGTGCGCTTATAGCTAGGTGTTGGTACCTTCATGAAGGTCACCAAACTGCTGCATTTA - 28080

-VVRL* LGVGTFMKVTKLLHL 
-WCAYS*VLVPS*RSPNCCI* 
GALIARCWYLHEGHQTAAFR 
28081 - GAGACGTACTTGTTGTTTTAAATAAACGAACAAATTAAAATGTCTGATAATGGACCCCAA - 28140 
-ETYLLF* INEQIKMSDNGPQ 
-RRTCCFK* TNKLKCLIMDPN 
DVLVVLNKRTN*NV**WTPI 
28141 - TCAAACCAACGTAGTGCCCCCCGCATTACATTTGGTGGACCCACAGATTCAACTGACAAT - 28200 
-SNQRSAPRITFGGPTDSTDN 
-QTNVVPPALHLVDPQIQLTI 
KPT*CPPHYIWWTHRFN*Q* 
28201 - AACCAGAATGGAGGACGCAATGGGGCAAGGCCAAAACAGCGCCGACCCCAAGGTTTACCC - 28260 

- N QNGGRNGARPKQRRPQGLP 
-TRMEDAMGQGQNSADPKVYP 

PEWRTQWGKAKTAPTPRFTQ 
28261 - AATAATACTGCGTCTTGGTTCACAGCTCTCACTCAGCATGGCAAGGAGGAACTTAGATTC - 28320 
-NNTAS WFTALTQHGKE ELRF 

- I I LRLGSQLSLSMARRNLDS 

*YCVLVHSSHSAWQGGT*IP 
28321 - CCTCGAGGCCAGGGCGTTCCAATCAACACCAATAGTGGTCCAGATGACCAAATTGGCTAC - 28380 
-PRGQGVP INTNSGPDDQIGY 
-LEARAFQSTPIVVQMTKLAT 
SRPGRSNQHQ*WSR* PNWLL 
28381 - TACCGAAGAGCTACCCGACGAGTTCGTGGTGGTGACGGCAAAATGAAAGAGCTCAGCCCC - 28440
-YRRATRRVRGGDGKMKELS P 
-TEELPDEFVVVTAK*KSSAP 
PKSYPTSSWW* RQNERAQPQ 
28441 - AGATGGTACTTCTATTACCTAGGAACTGGCCCAGAAGCTTCACTTCCCTACGGCGCTAAC - 28500 
-RWYFYYLGTG PEASL PYGAN 
DGTS IT*ELAQKLHFPTALT 
MVLLLPRNWPRSFTSLRR*Q 
28501 - AAAGAAGGCATCGTATGGGTTGCAACTGAGGGAGCCTTGAATACACCCAAAGACCACATT - 28560
-KEGIVWVATEGALNTPKDHI 
-KKASYGLQLREP * IHPKTTL 
RRHRMGCN*GSLEYTQRPHW 
28561 - GGCACCCGCAATCCTAATAACAATGCTGCCACCGTGCTACAACTTCCTCAAGGAACAACA - 28620
-GTRNPNNNAATVLQLPQGTT 
-APAILITMLPPCYNFLKEQH 
HPQS**QCCHRATTSSRNNI 
28621 - TTGCCAAAAGGCTTCTACGCAGAGGGAAGCAGAGGCGGCAGTCAAGCCTCTTCTCGCTCC - 28680
-LP KG FYAEGSRGGS QAS S RS 
-CQKASTQREAEAAVKPLLAP 
AKRLLRRGKQRRQS SL FSLL 
28681 - TCATCACGTAGTCGCGGTAATTCAAGAAATTCAACTCCTGGCAGCAGTAGGGGAAATTCT - 28740
-SSRSRGNSRNSTPGSSRGNS 
-HHVVAVIQEIQLLAAVGEIL 
IT* S R * FKKFNSWQQ* G K F S 
28741 - CCTGCTCGAATGGCTAGCGGAGGTGGTGAAACTGCCCTCGCGCTATTGCTGCTAGACAGA - 28800
-PARMASGGGETALALLLL DR 
-LLEWLAEVVKLPSRYCC* TD 
CSNG*RRW*NCPRAIAARQI 
28801 - TTGAACCAGCTTGAGAGCAAAGTTTCTGGTAAAGGCCAACAACAACAAGGCCAAACTGTC - 28860

- L N QLESKVS GKGQQQQGQTV 

- * TSLRAKFLVKANNNKAKLS 

EPA*EQSFW*RPTTTRPNCH 
28861 - ACTAAGAAATCTGCTGCTGAGGCATCTAAAAAGCCTCGCCAAAAACGTACTGCCACAAAA - 28920
-TKKSAAEASKKPRQKRTATK 

- LRNLLLRHLK SLAKNVLPQN 

*EICC*GI*KASPKTYCHKT 
28921 - CAGTACAACGTCACTCAAGCATTTGGGAGACGTGGTCCAGAACAAACCCAAGGAAATTTC - 28980 
-QYNVTQAFGRRGPEQTQGNF 
-STTSLKHLGDVVQNKPKEIS 
VQRHSS IWETWSRTNPRKFR 
28981 - GGGGACCAAGACCTAATCAGACAAGGAACTGATTACAAACATTGGCCGCAAATTGCACAA - 29040
-GDQDLIRQGTDYKHWPQ IAQ 

- G T K T * SDKELITNIGRKLHN 

GPRPNQTRN*LQTLAANCTI 
29041 - TTTGCTCCAAGTGCCTCTGCATTCTTTGGAATGTCACGCATTGGCATGGAAGTCACACCT - 29100
-FAPSASAFFGMSRI GMEVTP 
-LLQVPLHSLECHALAWKSHL 
CSKCLCILWNVTHWHGSHTF 
29101 - TCGGGAACATGGCTGACTTATCATGGAGCCATTAAATTGGATGACAAAGATCCACAATTC - 29160 
-SGTWLTY HGAIKLDDKDPQF 
-REHG*LIMEPLNWMTKIHNS 
GNMADLSWSH* IG*QRSTIQ 
29161 - AAAGACAACGTCATACTGCTGAACAAGCACATTGACGCATACAAAACATTGCCACCAACA - 29220 
-KDNV ILLNKHI DAYKTFPPT 
-KTTSYC*TSTLTHTKHSHQQ 
RQRHTAEQAH*RIQNI PTNR 
29221 - GAGCCTAAAAAGGACAAAAAGAAAAAGACTGATGAAGCTCAGCCTTTGCCGCAGAGACAA - 29280
-EPKKDKKKKTDEAQPLPQRQ 

- SLKRTKRKRLMKLSLCRRDK 

A* KGQKEKD* * SSAFAAETK 
29281 - AAGAAGCAGCCCACTGTGACTCTTCTTCCTGCGGCTGACATGGATGATTTCTCCAGACAA - 29340 
-KKQPTVTLLPAADMDDFSRQ 
-RSSPL*LFFLRLTWMISPDN 
EAAHCDSSSCG*HG*FLQTT 
29341 - CTTCAAAATTCCATGAGTGGAGCTTCTGCTGATTCAACTCAGGCATAAACACTCATGATG - 29400
-LQNSMSGASADSTQA* TLMM 

- F K I P*VELLLIQLRHKHS** 

SKFHEWSFC* FNSGIN THDD 
29401 - ACCACACAAGGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTACGATACATAGTC - 29460
-TTQGRWAM* TFSQFRLRYIV 
PHKADGLCKRFRNSVYDT*S 
HTRQMGYVNVFAIPFT IHSL 
29461 - TACTCTTGTGCAGAATGAATTCTCGTAACTAAACAGCACAAGTAGGTTTAGTTAACTTTA - 29520
-YSCAB * ILVTKQHK*V* LTL 
-TLVQNEFS*LNSTSRFS*L* 
LLCRMNSRN*TAQV-GLVNFN 
29521 - ATCTCACATAGCAATCTTTAATCAATGTGTAACATTAGGGAGGACTTGAAAGAGCCAGCA - 29580 
-ISHSNL*SMCNI REDLKEPP 
-SHIAI FNQCVTLGRT* KSHH 
LT*QSLINV*H* GGLERATT 
29581 - CATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGTGAATAATGCTAGGGAGAG - 29640 
-HFHRGHAEYDRGYSE * C * G E ■ 
I FIEATRSTIEGTVNNARES 
FSSRPRGVRSRVQ* IMLGRA 
29641 - CTGCCTATATGGAAGAGCCCTAATGTGTAAAATTAATTTTAGTAGTGCTATCCCCATGTG - 29700 
-LPIWKSPNV*N* F * *CYPHV 
- CLYGRALMCKINFSSAIPM* 
AYMEEP*CVKLILVVLSPCD 
29701 - ATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAA - 29742
-I LIAS ^ENDKKKKKX 
F * *LLRRMTKKKKX 
FNS FLGE*QKKKKX 
1 - TTTTTTTTTTTTTTTGTCATTCTCCTAAGAAGCTATTAAAATCACATGGGGATAGCACTA - 60

- F F F F F V I LLRSY*NHMG IAL 
-FFFFLSFS*EAIKITWG±HY 

FFFFCHSPKKLLKSHGDSTT 
61 - CTAAAATTAATTTTACACATTAGGGCTCTTCCATATAGGCAGCTCTCCCTAGCATTATTC - 120 
-LKLILHI RALPYRQLS LALF 
*N*FYTLGLFHIGSSP*HYS 
KINFTH*GSSI*AALPSIIH 
121 - ACTGTACCCTCGATCGTACTCCGCGTGGCCTCGATGAAAATGTGGTGGCTCTTTCAAGTC - 180

- T V P S IVLRVASMKMWWL FQV 
-LYPRSYSAWPR*KCGGSFKS 

CTLDRTPRGLDENVVALSSP 
181 - CTCCCTAATGTTACACATTGATTAAAGATTGCTATGTGAGATTAAAGTTAACTAAACCTA - 240 
-LPNVTH* LKIAM^D* S * LNL 

- SLMLHID*RLLCEIKVN*TY 

p * CYTLIKDCYVRLKL TKPT 
241 - CTTGTGCTGTTTAGTTACGAGAATTCATTCTGCACAAGAGTAGACTATGTATCGTAAACG - 300 
-LVLFSYENSFCTRVDYVS*T 
-LCCLVTRIHSAQE* TMYRKR 
CAV^LREFILHKSRLC IVNG 
301 - GAATTGCGAAAACGTTTACATAGCCCATCTGCCTTGTGTGGTCATCATGAGTGTTTATGC - 360 
-ELRKRLHSPSALCGHHE CLC 
-NCENVYIAHLPCVVIMSVYA 
IAKTFT*PICL>VWSS*VFMP 
361 - CTGAGTTGAATCAGCAGAAGCTCCACTCATGGAATTTTGAAGTTGTCTGGAGAAATCATC - 420 
-LS*ISRSS THGILKLSGEII 
*VESAEAPLMEF* SCJjEKSS 
ELNQQKLHSWNFEVVWRNHP 
421 - CATGTCAGCCGCAGGAAGAAGAGTCACAGTGGGCTGCTTCTTTTGTCTCTGCGGCAAAGG - 480 
-HVSRRKKSHSGLLLL SLRQR 
-MSAAGRRVTVGCFFCLCGKG 
CQPQEEESQWAAS FVSAAKA 
481 - CTGAGCTTCATCAGTCTTTTTCTTTTTGTCCTTTTTAGGCTCTGTTGGTGGGAATGTTTT - 540
-LSFISLFLFVLFRLCWWECF 
*ASSVFFFLSFLGSVGGNVL 
ELHQSFSFCPF*ALLVGMFC 
541 - GTATGCGTCAATGTGCTTGTTCAGCAGTATGACGTTGTCTTTGAATTGTGGATCTTTGTC - 600 
-VCVNVLVQQYDVVFELWIFV 
-YASMCLFSSMTLSLNCGSLS 
MRQCACSAV*RCL*IVDLCH 
601 - ATCCAATTTAATGGCTCCATGATAAGTCAGCCATGTTCCCGAAGGTGTGACTTCCATGCC - 660 
-IQFNGSM I SQPCSRRCDFHA 
~SNLMAP**VSHVPEGVTSMP 
PI^WLHDKSAMFPKV*IiPCQ 
661 - AATGCGTGACATTCCAAAGAATGCAGAGGCACTTGGAGCAAATTGTGCAATTTGCGGCCA - 720 
~NA*HSKECRGTWSKLCNLRP 

- M R D I PKNAEALGANCAICGQ 

CVTFQRMQRHLEQIVQFAAN 
721 - ATGTTTGTAATCAGTTCCTTGTCTGATTAGGTCTTGGTCCCCGAAATTTCCTTGGGTTTG - 780
-MFVISSLSD*VLVPEISLGL 
-CL*SVPCLIRSWSPKFPWVC 
VCNQFLV* LGLGPRN FLGFV 
781 - TTCTGGACGACGTCTCGCAAATGCTTGAGTGACGTTGTACTGTTTTGTGGCAGTACGTTT - 840
-FWTTSPKCLS DVVLFCGSTF 

- SGPRLPNA*VTLYCFVAVRF 

LDHVSQMLE*RCTVLWQYVF 
841 - TTGGCGAGGCTTTTTAGATGCCTCAGCAGCAGATTTCTTAGTGACAGTTTGGCCTTGTTG - 900
-LARLFRCLSSRFLSDSLALL 
-WRGFLDASAADFLVTVWPCC 
GBAF*MPQQQIS**QF GLVV 

901 - TTGTTGGCCTTTACCAGAAACTTTGCTCTCAAGCTGGTTCAATCTGTCTAGCAGCAATAG - 960 
-LLAFTRNFALKLVQSV*QQ* 
-CWPLPETLLSSWFNLSSSNS 
VGLYQKLC SQAGS ICLAAIA 

961 - CGCGAGGGCAGTTTCACCACCTCCGCTAGCCATTCGAGCAGGAGAATTTCCCCTACTGCT - 1020 
-REGS FTTSASHSSRRI S PTA 

- A R A V S PP PLAIRAGEFPLLL 

RGQFHHLR* PFEQENFPYCC 
1021 - GCCAGGAGTTGAATTTCTTGAATTACCGCGACTACGTGATGAGGAGCGAGAAGAGGCTTG - 1080 
-ARS*IS*ITATT**GARRGL 
-PGVEFLELPRLRDEEREEA* 
QELNFLNYRDYVMRSEKRLD 
1081 - ACTGCCGCCTCTGCTTCCCTCTGCGTAGAAGCCTTTTGGCAATGTTGTTCCTTGAGGAAG - 1140 

- T A A S A S LCVEAFWQCC SLRK 
-LPPLLPSA*KPFGNVVP*GS 

CRIiCFPLRRSLLAMLFLEEV 
1141 - TTGTAGCACGGTGGCAGCATTGTTATTAGGATTGCGGGTGCCAATGTGGTCTTTGGGTGT - 1200 

- L * HGGS IV IRIAGANVVFGC 
-CSTVAALLLGLRVPMWSLGV 

VARWQHCY * DCGCQCGLWVY 
1201 - ATTCAAGGCTCCCTCAGTTGCAACCCATACGATGCCTTCTTTGTTAGCGCCGTAGGGAAG - 1260
-IQGSLSCNPYDAFFVSAVGK 

- FKAPSVATHTMP SLLAP * GS 

SRLPQLQPIRCLLC*RRREV 
1261 - TGAAGCTTCTGGGCCAGTTCCTAGGTAATAGAAGTACCATCTGGGGCTGAGCTCTTTCAT - 1320 
-*SFWASS*VrEVPSGAELFH 
-EASGPVPR* *KYHLGLSSFI 
KLLGQFLGNRSTIWG*ALSF 
1321 - TTTGCCGTCACGACCACGAACTCGTCGGGTAGCTCTTCGGTAGTAGCCAATTTGGTCATC - 1380 
-FAVTTTNSSGSSSVVANLVI 
~LPSPPRTRRVALR**PIWSS 
CRHHHELVG*LFGSSQFGHL 
1381 - TGGACCACTATTGGTGTTGATTGGAACGCCCTGGCCTCGAGGGAATCTAAGTTCCTCCTT - 1440
-WTTIGVDWNALASRESKFLL 
-GPLLVLIGTPWPRGNLSSSL 
DHYWC*LERPGLEGI*VPPC 
1441 - GCCATGCTGAGTGAGAGCTGTGAACCAAGACGCAGTATTATTGGGTAAACCTTGGGGTCG - 1500 
-AMLSESCEPRRS I I G * TLGS 

- PC*VRAVNQDAVLLGKPWGR 

HAE*EL*TKTQYYWVNLGVG 
1501 - GCGCTGTTTTGGCCTTGCCCCATTGCGTCCTCCATTCTGGTTATTGTCAGTTGAATCTGT - 1560 
-ALFWPCPIASSILVIVS^IC 
-RCFGLAPLRPPFWLLSVESV 
AVLALPHCVLHSGYCQLNLW 
1561 - GGGTCCACCAAATGTAATGCGGGGGGCACTACGTTGGTTTGATTGGGGTCCATTATCAGA - 1620 
-GS TKCNAGGTTLV*LGS I IR 
-GPPNVMRGALRWFDWGPLSD 
VHQM*CGGHYVGLIGVHYQT 
1621 - CATTTTAATTTGTTCGTTTATTTAAAACAACAAGTACGTCTCTAAATGCAGCAGTTTGGT - 1680 
-HFNLFVYLKQQVRL*MQQFG 
-ILICSFI*NNKYVSKCSSLV 
F* FVRLFKTTSTSLNAAVW* 
1681 - GACCTTCATGAAGGTACCAACACCTAGCTATAAGCGCACCACCAGCTGGATCTTGACAGT - 1740
-DLHEGTNT* L*AHHQLDLDS 
-TFMKVPTP SYKRTTSWILTV 
PS*RYQHLAISAPPAGS * Q L 
1741 - TGATAGTAACATTAGGTGTGCATGTTTGAACCATAGTGTGCCATCTATGAAAAGGTAAAA - 1800
-***H*VCMFEP*CAIYEKVK 
-DSN IRCACLNHSVPSMKR*N 
IVTLGVHV* TIVCHL*KGKT 
1801 - CCTTTCCTAGAGCACAAAGCCAAGCAGTGCTATAAGTATTACCCCTAGTGTTGTACCTTA - 1860
-PFLEHKAKQCYKYYF*CCTL 
-LS^STKPSSAISITPSVVPY 
FPRAQSQAVL*VLPLVLYLT 
1861 - CAAGGATCTTCAAGCACATGAGGTTTATTAGATGCACAGCGCTGTACTACAGTGCATATG - 1920 
-QGS SST * GLLDAQRCTTVHM 
KDLQAHEVY *MHSAVLQC IC 
RIFKHMRFIRCTALYYSAYA 
1921 - CAACTGCATAGAGAAATACAAGTCAAAACAATGAGAAGTTTCATGTTCGTTTAGACTTTG - 1980 
-QLHRE I QVKTMRSFMFV * T L 
-NCIEKYKSKQ*EVSCSFRLW 
TA*RNTSQNNEKFHVRLDFG 
1981 - GTACAAGGTTCTTCTAGATCCTGGATTTCGAGTGAAAACCAAAATATAATAAGCATTATT - 2040
-VQGSSRSWISSENQNII SII 
-YKVLLDPGFRVKTKI* * A L L 
T R F F * ILDFE*KPKYNKHY* 
2041 - AAAACAAGGAATAGCAGAAAGGCTAAAAAGCACAAATAGAAGTCAATTAAAGTGAGCTCA - 2100
-KTRNS RKAKKHK* KS IKVS S 

- KQGIAERLKSTNRSQLK*AH 

NKE*QKG*KAQIEVN*SELI 
2101 - TTCATTCTGTCTTTCTCTTAATGGTGAAGCAAAGTATTAAAAATACTAGAGCAGCAACAA - 2160 
-FILSFS * W * SKVLKI LEQQQ 
-SFCLSLNGEAKY*KY*SSNN 
HSVFLLMVKQSIKNTRAATM 
2161 - TGAGAAAAAGTGGCGAGTAGAGCTCTTGTTGAACCTCCTCTTGTCTGATGAAAAGTTTTG - 2220 
-*EKVASRALVEPPLV**KVL 

- EKKWRVELLLNLLLSDEKFW 

RKSGE*SSC*TSSCLMKSFG 
2221 - GTGAAACTGATCTTGCACGCAGCTGATAGGTATGTCGAGTACCGTCAGCACAAGCAAAAG - 2280
-VKLILHAADR YVEY RQHKQK 
-*N*SCTQLIGMSSTVSTSKS 
ETDLARS* *VCRVPSAQAKA 
2281 - CAAAGTGTGTGCTAGTGCAAGTTAGTGCAAATTTATTGTCAGCAAGAGGGTGAAATGGTG - 2340 
-QSVC*CKLVQIYCQQEGEMV 
-KVCASAS*CKFIVSKRVKW* 
KCVLVQVSANLLSARG* NGE 
2341 - AATTGCCCTCGTATGTTCCTGATGGGCAAGGTTCTTTTAGTAGTACAGTCGTACCTCTAA - 2400
-NCPRMFLMGKVLLVVQS YL* 
-IALVCS*WARFF**YSRTSN 
LPSYVPDGQGSFSSTVVPLT 
2401 - CACACTCCTGATAGTGATATAGCTCGCAAGATGTAAATACAATCAATGTCAGGAAGAGAA - 2460

- H T P DSDIARKM* IQSMSGRE 
-TLLIVI * LARCKY NQCQEEN 

HS***Y'SSQDVNTINVRKRI 
2461 - TAATTTTCATGTTCGTTTTATGGATAATCTAACTCCATAGGTTCTTCATCATCTAACTCC - 2520
-*FSCSFYG*SNSIGSSSSNS 
-*NFHVRFMDNLTP*VIiHHLTP 
IFMFVLWI I*LHRFFII * L R 
2521 - GAATAATTCTTCTTAGTTAGAGGCTTAAATAATTGTCTCACTATTGAACTTATTATAACG - 2580

- E * FFLVRGLNNCLTIBL I IT 
-NNSS*LEA*IIVSLLNLL*R 

IILLS *RLK*LSHY*TYYNV 
2581 - TCAAGATTCCAAATAGCAATCCTGAAAGTCCTCATAATGATAATCAATATCTCTGCTATT - 2640

- S R F Q IAILKVLIMI INI SAI 
-QDSK*QS*KSS***SISLLL 

KIPNSNPESPHNDNQYLCYC 
2641 - GTAACCTGGAAGTCAACAAGATGAAACATCTGTTGTCACTTACTGTACTAGCAAAGCAAT - 2700
-VTWKSTR* N ICCHLLY* QSN 
* PGSQQDETSVVTYCTSKAI 
NLEVNKMKHLLSLTVLAKQY 
2701 - ATTGTCGTTGCTACCGGCGTGGTCTGTATTTAATTTATAGTTTGCAATACGGTAGCGGTT - 2760
-IVVATGVVCI*FIVSNTVAV 

- LSLLPAWSVFNL*FPIR*RL 

CRCYRR GLYLIYSFQYGSGC 
2761 - GTATGCAGCAAAACCTGAATCAGTGCCTACACGCTGCGACGCTCCTAATTTGTAATAAGA - 2820 
-VCSKT*ISAYTLRRS*FVIR 
-YAAKPESVPTRCDAPNL* * E 
MQQNLNQCLHAATLLI CNKK 
2821 - AAGCGTTCGTGATGTAGGCACAGTGATCTCTTTTGGCAGGTCCTTAATGTCACAGCGCCC - 2880
-KRS * C S H S DLFWQVLNVTAP 
SVRDVATVISFGRSLMSQRP 
A F V M * PQ*SLLAGP*CHSAlr 
2881 - TAGGGAGTGTCCGGCCATTCGCAAGTGACCACGAATGATCACAGCACCAATGACAAGTTC - 2940 

- * GVSGHSQVTTN DHST 'N DK F 
-RECPAIRK* PRMITAPMTSS 

GSVRPFASDHE*SQHQ*QVH 
2941 - ACTTTCCATGAGCGGTCTGGTCACAATTGTCCCCCGGAGAGGCACATTGAGAAGAATGTT - 3000
-TFHERSGHNCPPERHIEKNV 
-LSMSGLVTIVPRRGTLRRMF 
FP*AVWSQLSPGEAH*EECL 
3001 - TGTTTCTGGGTTGAATGACCACATTGAGCGGGTACGAGCAAACAGCCTGAAGGAAGCAAC - 3060 

- C F W V E * PH*AGTSKQPEGSN 
-VSGLNDHIERVRANSLKEAT 

FLG*MTTLSGYEQTA* RKQR 
3061 - GAAGTAGCTAAGCCACATCAAGCCTACAATACAAGCCATTGCAATCGCAATCCCGCCAGT - 3120
-EVAKPHQAYNTS HCNR'N PAS 
-K*LSHIKPTIQAIAIAIPPV 
SS*ATSSLQYKPLQSQSRQS 
3121 - CACCCAATTAATTCTGTAGACAACAGCAAGCACAAAACAAGCAAGTGTTACTGGCCACAA - 3180 
-HPINSVDNSKHKTSKCYWPQ 
-TQLIL*TTASTKQASVTGHK 
P N * FCRQQQAQNKQ VLLATR 
3181 - GAGCCAGAGGAAAACAAGCTTTATTATGTACAAAAACCTGTTCCGATTAGAATAGGCAAA - 3240
-EPEENKLYYVQKPVPIR IGK 
-SQRKTSFIMYKNLFRLE*AN 
ARGKQALLCTKTCSD*NRQI 
3241 - TTGTAGTAACATAATCCAGGCTAGGAATAGGAAACCTATTACTAGGTTCCATTGTTCCAG - 3300 

- L * * HNPG*E*ETYY*V PLFQ 

- C S N I IQARNRKPITRFHCSR 

V V T * SRLGIGNLLLGSIVPG 
3301 - GAGTTGTTTAAGCTCCTCAACGGTAATAGTACCGTTGTCTGCCATGATAAGCAATGTTAA - 3360 
-ELFKLLNGNSTVVCHDKQC* 
-SCLSSSTVIVPLSAMISNVK 
VV*APQR**YRCLP**AMLK 
3361 - AGTTCCAAACAGAATAATAATAATAGTTAGTTCGTTTAGACCAGAAGATCAGGAACTCCT - 3420
-SSKQNNNNS* F V * TRRSGTP 
-VPNRIIIIVSSFRPEDQELL 
F Q T E * * * *LVRLDQKIRNSF 

3421 - TCAGAAGAGTTCAGATTTTTAACACGCGAGTAGACGTAAACCGTTGGTTTTACTAAACTC - 3480
-SEEFRFLTRE*T*TVGFTKL 
-QKS S DF*HASRRKPLVLLNS 
RRVQIFNTRVDVNRWFY*TH 

3481 - ACGTTAACAATATTGCAGCAGTACGCACACAATCGAAGCGCAGTAAGGATGGCTAGTGTG - 3540
-TLTILQQYAHNRSAVRMASV 

- R * QYCSSTHTIEAQ*GWLV* 

VNNIAAVRTQSKRSKDG^CD 
3541 - ACTAGCAAGAATACCACGAAAGCAAGAAAAAGAAGTACGCTATTAACTATTAACGTACCT - 3600 
-TSKNTTKARKRSTLLT INVP 

- L A R I PRKQEKEVRY*LLTYL 

*QEYHESKKKKYAINY*RTC 
3601 - GTTTCTTCCGAAACGAATGAGTACATAAGTTCGTACTCACTTTCTTGTGCTTACAAAGGC - 3660 

- V S S ETNEY IS SYSLSCAYKG 

FLPKRMST *VRTHFLVLTKA 
FFRNE*VHKFVLTFLCIiQRH 
3661 - ACGCTAGTAGTCGTCGTCGGCTCATCATAAATTGGATCCATTGCTGGATTAGCAACTCCT - 3720 
-TLVVVVGSS* IGSIAGLATP 
-R**SSSAHHKLDPLLD*QLL 
ASSRRRLIINWIHCWISNS* 
3721 - GAAGAGCCGTCGATTGTGTGTATTTGCACATTCGGTGGGTCTTTAACAAGCTTGTTAAAG - 3780 
-EEPSIVCICTFGGSLTSLLK 
KSRRLCVFAHSVGIi*QAC*R 
RAVDCVYLHIRWVFNKLVKD 
3781 - ATGAAGAATGTAGCATTTTGAATACCAGTGTCTGTAGTAATTTGTGTAGACTCAAGCTGG - 3840 
-MKNVAFSI PVSVVICV DSSW 
*RM*HFQYQCL* * F V * TQAG 
EECSIFNTSVCSNLCRLKLV 
3841 - TAGTAAACTTCGGTGAAATAGCCATGTACAACGACATAGTCTTTAACACCTGAGTGCCTA - 3900
-**TSVK*PCTTT*SLTPECL 
-SKLR*NSHVQRHSL*HLSAY 
VNFGEIAMYNDIVFNT*VPI 
3901 - TCCTCAGAATAACCACCAATTTGGTAGTCTTCTTTGAGTTTTGGTGTTGAAATGCCGTCA - 3960 
-SSE*PPIW*SSLSFGVEMPS 
PQNNHQFGSLL* VLVLKCRH 
LRITTNLVVFFEFWC*NAVT 
3961 - CCTTCAGTAACGACAATTGTATCTGTGACACTGTTATATGGTATACAGTAGTCATAGTTA - 4020 
-PSVTTIVSVTLLYGIQ*S*L 
~LQ*RQLYL*HCYMVYSSHSY 
FSNDNCICDTVIWYTVVIVM 
4021 - TGTGTGTGCCAGCAAACAAAGTAGTTGGCATCATAAAGTAATGGGTTCTTGGATTTGCAC - 4080
-CVCQQTK* LAS* SNGFLDLH 
-VCASKQSSWHHKVMGSWICT 
CVPANKVVGIIK*WVLGFAL 
4081 - TTCCAACAAAGCCAACATCTCATAATAATTCTACATGCGTTGATGCATTGTAGAAAATAT - 4140
-FQQSQHLI I ILHALMHCRKY 
-SNKANIS**FYMR*CIVENI 
PTKPTSHNNSTCVDAL*KIY 
4141 - ATCAAGGCATAGAGGTACAAAAATTGCGCCTCCTTACCTGCAGCGACAAGCAAAAGATGT - 4200
-I K A * RYKNCASL PAATSKRC 
SRHRGTKIAPPYLQRQAKDV 
QGI EVQKLRLLTCSDKQKM* 
4201 - GAATAGATGGTAACAAATAGCAGCAGTAAATTGCAAATGAACTGGAAGCCCTTATAAAGG - 4260
-E*MVTNSSSKLQMNWKPI»*R 
-NRW*QIAAVNCK* TGSPYKG 
I DGNK*QQ* IANELEALIKG 

4261 - GCTAGCTGCCATCTTTTATTGAGCGCAATTATTTTGGTAGCGCTCTGAAAAACAGCAAGA - 4320
-ASCHLLLSAI ILVAL*KTAR 
-LAAIFY*AQIjFW*RSEKQQE 
* L P S FIERNYFGSALKNSKK 

4321 - AATGCAACGCCAATAACAAGCCATCCGAAAGGGAGTGAGGCTTGTAGCGGTATCGTTGCT - 4380 
-NATPITSHPKGSEACSGIVA 

- M Q R Q * QAIRKGVRLVAVSLL 

CNANNKPSERE* GL*RYRCC 
4381 - GTAGCATGAACAGTACTTGCAGGAGAAGCATTGTCAATTTTTACTGGCTGTGCAGTAATT - 4440

- V A * TVLAGEALS I FTGCAVI 

- * REQYLQEKKCQFLLAVQ*L 

SMNSTCRRSIVNFYWLCSN* 
4441 - GATCCAAGAGTAAAAAATCTCATAAACAAATCCATAAGTTCGTTTATGTGTAATGTAATT - 4500
-DPRVKNLINKSISSFMCNVI 
-IQE*KIS*TNP*VRLCVM*F 
SKSKKSHKQIHKFVYV*CNL 
4501 - TGACACCCTTGAGAACTGGCTCAGAGTCATCCTCATCAAACTTGCAGCAAGAACCACAAG - 4560 

- * H P * ELAQSHPHQTCSKNHK 
-DTLENWLRVILIKLAARTTR 

TPLRTGSESSSSNLQQEPQE 
4561 - AGCATGCACCCTTGAGGCAACTGCAACAACTAGTCATGCAACAAAGCAAGATTGTAACCA - 4620
-SMEP * GNCNM* SCNKARL* P 
-ACTLEATATTSHATKQDCNH 
HAPLRQLQQLVMQQSKIVTM 
4621 - TGACGATGGCAATTAGTCCAGCAATGAAGCCGAGCCAAACATACCAAGGCCATTTAATAT - 4680
-*RWQLVQQ* SRAKHTKAI * Y 
-DDGN*SSNEAEPNIPRPFNI 
TMAI S PAMKPSQTYQGHLIY 
4681 - ATTGCTCATATTTTCCCAATTCTTGAAGGTCAATGAGTGATTCATTTAAATTTTTAGCGA - 4740

- I A H I FPILEGQ*VIHLNF*R 
-LLIFSQFLKVNE* FI * IFSD 

CSYFPNS*RSMSDSFKFLAT 
4741 - CGTCATTGAGGCGGTCAATTTCTTTTTGAATGTTGACGACAGAAGCGTTAATGCCTGAAA - 4800 

- P H *GGQFLFEC*RQKR* CLK 
-LIEAVNFFLNVDDRSVNA*N 

SLRRS ISF^MLTTEALMPEM 
4801 - TGTCGCCAAGATCAACATCTGGTGATGTATGATTTTTGAAGTACTTGTCCAGCTCTTCTT - 4860
-CRQDQHLVMYDF* STCPALL 
-VAKINIW* CMIFEVLVQLFF 
-^SPRSTSGDV*FLKYLSSSSL 
4861 - TGAATGAGTCAAGCTCAGGTTGCAGAGGATCATAAACTGTGTTGTTAATGATGCCAATAA - 4920
~*MSQAQVAEDHKLCC* * C Q * 
-E^VKLRLQRI INCVVNDANN 
NESS SGCRGS*TVLLMMPIT 
4921 - CGACATCACAATTTCCTGAGACAAATGTATTGTCTGTAGTAATTATTTGTGGAGAAAAGA - 4980
-RHHN FLRQMYCL* * L F V E K R 
-DITIS*DKCIVCSNYLWRKE 
TSQFPETNVLSVVI ICGEKK 
4981 - AGTTCCTCTGTGTAATAAACCAAGAAGTGCCATTAAACACAAAAACACCTTCACGAGGGA - 5040 
-SSSV* *TKKCH*TQKHLHEG 
-VPLCNKPRSAIKHKNTFTRE 
FLCV INQEVPLNTKTP SRGK 
5041 - AGTATGCTTTGCCTTCATGACAAATTGCTGGCGCTGTGGTGAAGTTCCTCTCCTGGGATG - 5100
-SMLCLHDKLLALW* SS S PGM 
VCFAFMTNCWRCGEVPLLGW 
YALPS *QIAGAVVKFLSWDG 
5101 - GCACATACGTGACATGTAGGAAGACAACACCATGCGGGGCTGCTTGTGGGAAGGACATAA - 5160 

- A H T * HVGRQHHAGLLVGRT* 
-HIRDM*SDNTMRGCLWEGHK 

TYVTCRKTTPCGAACGKDIR 
5161 - GGTGGTAGCCCTTTCCACAAAAGTCAACTCTTTTTGATTGTCCAAGAACACACTCAGACA - 5220 
-GGSPFHKSQLFLIVQEHTQT 
-VVALSTKVNSF*LSKNTLRH 
W * PFPQKSTLFDCPRTHSDI 
5221 - TTTTAGTAGCAGCAAGATTAGCAGAAGCCCTGATTTCAGCAGCCCTGATTAGTTGTTGTG - 5280

- F * * QQD* QKP*FQQP * LVVV 
-FSSSKISRSPDFSSPD*LLC 

LVAARLAEALI SAALI SCCV 
5281 - TTACATAGGTTTGAAGGCTTTGAAGTCTGCCTGTAATTAACCTGTCAATTTGTACCTCCG - 5340 
-LHRFEGFEVCL* LTCQFVPP 

- YIGLKALKSA'CW*PVNLYLR 

T*V*RL*SLPVINLSICTSA 
5341 - CCTCGACTTTATCAAGTCGCGAAAGGATATCATTTAGCACACTTGAAATTGCACCAAAAT - 5400
-PRLYQVAKGYHLAHLKLHQN 

- LDFI KSRKDI I *HT*NCTKI 

STLSSRERISFSTLEIAPKL 
5401 - TAGAGCTAAGTTGTTTAACAAGTGTGTTTAATGCTTGAGCATTCTGGTTAACAACGTCTT - 5460
~*S*VV-*QVCLMLEHSG*QRL 

- RAKLFNKCV*CLS ILVNNVL 

ELSCLTSVFNA*AFWLTTSC 
5461 - GCAGCTTGCCCAATGCAGTTGATGTTGTTGTAAGTGATTCTTGAATTTGACTAATCGCCT - 5520
-AACPMQLMLL*VILEFD* SP 
~QLAQCS*CCCK*FLNLTNRL 
SLPNAVDVVVSDS*I*LIAL 
5521 - TGTTAAATTGGTTGGCGATTTGTTTTTGGTTCTCATAGAGAACATTTTGGGTAACTCCAA - 5580

- C * IGWRFVFGSHREHFG* LQ 
-VKLVGDLFLVLIENI LGNSN 

LNWLAICFWFS*RTFWVTPM 
5581 - TGCCATTGAACCTATATGCGATTTGCATAGCAAAAGGTATTTGAAGAGCAGCGCCAGCAC - 5640 
-CH*TYMPFA*QKVFEEQRQH 
-AIEPICHLHSKRYLKSSAST 
PLNLYAICIAKGI* R A A P A P 
5641 - CAAATGTCCATCCAGCAGTGGCAGTACCACTAACTAGAGCAGCAGTGTAGGCAGCAATCA - 5700
-QMS I QQWQYH*LEQQCRQQS 
-KCPSSSGSTTN*SSSVGSNH 
NVHPAVAVPLTRAAV*AA I I 
5701 - TATCATCAGTGAGCAGAGGTGGCAACACTGTAAGTCCATTGAACTTCTGCGCACAAATGA - 5760
-YHQ*AEVATL*VH*TSAHK* 
-IISEQRWQHCKSIELLRTNE 
SSVSRGGNTVSPLNFCAQMR 
5761 - GATCTCTAGCATTAATATCACCTAGGCATTCGCCATATTGCTTCATGAAGCCAGCATCAG - 5820

- D L * H*YHLGIRHIAS * SQHQ 
-ISSINIT*AFAILLHEASIS 

SLAL I SPRHSPYCFMKPASA 
5821 - CGAGTGTCACCTTATTAAAGAGCAAGTCCTCAATAAAAGACCTCTTAGTTGGCTTTAGAG - 5880 
-RVSPY*RASPQ*KTS*LALE 
-ECHLIKEQVLNKRPLSWL*R 

SVTLLKSKSSIKDLLVGFRG 
5881 - GGTCAGGTAATATTTGTGAAAAATTAAAACCACCAAAATATTTCAAAGTTGGGGTTTTGT - 5940
-GQVIFVKN*NHQNISKLGFC 

- V R * Y L * KIKTTKIFQSWGFV 

SGNICEKLKPPKYFKVGVLY 
5941 - ACATTTGTTTGACTTGAGCGAACACTTCACGTGTGTTGCGATCCTGTTCAGCAGCAATAC - 6000 
-TFV*LERTLHVCCDPVQQQY 
-HLFDLSEHFTCVAILFSSNT 
ICLT*ANTSRVLRSCSAAIP 
6001 - CTGAGAGTGCACGATTTAGTTGTGTGCAAAAGCTACCATATTGGAGAAGCAAATTAGCAC - 6060 
-LRVHDLVVCKSYHIGEAN*H 
-*ECTI*LCAKATILEKQI ST 
ESARFSCVQKLPYWRS K L A H 
6061 - ATTCAGTAGAATCTCCGCAGATGTACATATTACAATCTACGGAGGTTTTAGCCATAGAAA - 6120 

- I Q * NLRRCTYYNLRRF* P * K 
-FSRISADVHITIYGGFSHRN 

SVESPQMYILQSTEVLAIET 
6121 - CAGGCATTACTTCTGTAGTAATGCTAATTGAAAAGTTAGTAGGTATAGCAATGGTGTTAT - 6180
-QALLL**C*LKS**V*QWCY 
-RHYFCSNAN*KVSRYSNGVI 
GITSVVMLIEKLVGIAMVLL 
6181 - TAGAGTAAGCAATTGAACTATCAGCACCTAAAGACATAGTATAAGCCACAATAGATTTTT - 6240 

- * SKQLNYQHLKT*YKPQ * IF 
-RVSN*TIST*RHSISHNRFL 

E*AIELSAPKDIV*ATIDFW 
6241 - GGCTAGTACTACGTAATAAAGAAACTGTATGGTAACTAGCACAAATGCCAGCTCCAATAG - 6300 
-G*YYVIKKLYGN*HKCQLQ* 
-ASTT**RNCMVTSTNASSNR 
LVLRNKETVW*LAQMPAPIG 
6301 - GAATGTCGCACTCATAAGAAGTGTCGACATGCTCAGCTCCTATAAGACAGCCTGCTTGAG - 6360 
-ECRTHKKCRHAQLL* DSLLE 
-NVALIRSVDMLSSYKTACLS 
MSHS*EVSTCSAPIRQPA*V 

6361 - TCTGGAATACATTGTTTCCAGTAGAATATATGCGCCAAGCTGGTGTGAGTTGATCTGCAT - 6420

-SGIHCFQ*NICAKLV*VDLH 

- LEYIVSSRIYAPSWCELICM 

WNTLFPVEYMRQAGVS*SA* 
6421 - GAATTGCTGTAGAAACATCAGTGCAGTTAACATCTTGATATAGAACAGCAACTTCAGATG - 6480 
-ELL * K H Q C S * HLDIEQQLQM 
-NCCRNI SAVNILI * N S N F R * 
IAVETSVQLTS* YRTATSDE 

6481 - AAGCATTTGTTCCAGGTGTAATTACACTTACACGCCCAAAAGAGCAAGGTGAAATGTCTA - 6540

- K H L FQV*LHIiHPQKSKVKCL 
-SICSRCNYTYTPKRAR*NV* 

AFVPGVITLTPPKEQGEMSN 
6541 - ATATTTCAGATGTTTTAGGATCTCGAACGGAATCAGTGAAATCAGAAACATCACGGCCAA - 6600
-I FQMF* DLERNQ*NQKHHGQ 
-YFRCFRISNGI SEIRN ITAK 
ISDVLGSRTESVKSETSRPN 
6601 - ATTGTTGAAATGGTTGAAATCTCTTTGAAGAAGGAGTTAACACACCAGTACCAGTGAGTC - 6660 
-IVEMVEI SLKKELTHQYQ*V 
-LLKWLKSL*RRS*HTSTSES 
C*NG*NLFEEGVNTPVPVSP 
6661 - CATTAAAATTAAAATTGACACACTGGTTCTTAATAAGGTCAGTGGATAATTTTGGTCCAC - 6720 

- H * N *N*HTGS* * G Q W. I I LVH 
-IKIKIDTLVLNKVSG* FWST 

LKLKLTHW FLIRSVDNFGPQ 
6721 - AAACCGTGGCCGGTGCATTTAAAAGTTCAAAAGAAAGTACTACAACTGTGTAAGGTTGGT - 6780
-KPWPVHLKVQKKVLQLCKV-G 

- N R G R C I*KFKRKYYNSVRLV 

TVAGAFKSSKESTTTL*GW* 
6781 - AGCCAATGCCAGTAGTGGTGTAAAAACCATAATCATTTAATGGCCAATAACAATTAAGAG - 6840

- S Q C Q * WCKNHNHLMANNN*E 
-ANASSGVKTIII*WPITIKS 

PMPVVV*KP*SFNGQ*QLRA 
6841 - CAGGTGGGGTGCAAGGTTTGCCATCAGGGGAGAAAGGCACATTAGATATGTCTCTCTCAA - 6900 
-QVGCKVCHQGRKAH* I CLSQ 
-RWGARFAI RGERH IRYVSLK 
GGVQGLPSGEKGTLDMSLSK 
6901 - AGGGCCTAAGCTTGCCATGTCTAAGATACCTATATTTATAATTATAATTACCAGTTGAAG - 6960
-RA*ACHV* DTYIYNYNYQLK 
-GPKLAMSKIPIFIIIITS*S 
GLSLPCLRYLYL*L*LPVEV 
6961 - TAGCATCAATGTTCCTAGTATTCCAAGCAAGGACACAACCCATGAAATCATCTGGCAATT - 7020 
-*HQCS*YSKQGHNP*NHLAI 
-SINVPSIPSKDTTHEIIWQF 
ASMFLVFQARTQPMKS SGNL 
7021 - TATAATTATAATCAGCAATAACACCAGTTTGTCCTGGCGCTATTTGTCTTACATCATCTC - 7080
-YNYNQQ*BQFVLALFVLHHL 
-IIIISNNTSLSWRYLSYIIS 
*L*SAITPVCPGAICLTSSP 
7081 - CCTTGACTACAAAAGAATCTGCATAGACATTGGAGAAGCAAAGATCATTCAACTTAGTGG - 7140 
-P*LQKNLHRHWRSKDHST*W 
-LDYKRICI DIGEAKI IQ LSG 
LTTKESA* TLEKQRSFNLVA 
7141 - CAGAAACGCCATAGCACTTAAAGGTTGAAAAAAATGTTGAGTTGTAGAGCACAGAGTAAT - 7200 
-QKRHS T*RIiKKMLS CRAQSN 
-RNAIALKG*KKC*VVEHRVI 
ETP*HLKVEKNVEL*STE*S 
7201 - CAGCAACACAATTAGAAATTTTTTTTCTCTCCCATGCATAGACAGAAGGGAATTTAGTAG - 7260 

- Q Q H N * KFFFSPMHRQKG I* * 

- SNTIRNFFSLPCI DRREFSS 

ATQLEIFFLSHA*TEGNLVA 
7261 - CATTAAAAACCTCTCCAAAAGGACACAAGTTTGTAATATTAGGGAATCTCACAACATCTC - 7320
-H*KPLQKDTSL* Y*GISQHL 

- IKNLSKRTQVCNIRESHNIS 

LKTSPKGHKFVILGNLTTSP 
7321 - CTGAGGGAACAACCCTGAAATTAGAGGTCTGGTAAATTCCTTTGTCAATCTCAAAGCTCT - 7380 
-LREQP*N*RSGKFLCQSQSS 
* GNNPEIRGLVNS FVNLKAL 
EGTTLKliEVW* I PLSISKLL 
7381 - TAACAGAGCATTTGAGTTCAGCAAGTGGATTTTGAGAACAATCAACAGCATCTGTGATTG - 7440
-*QSI*VQQVDFENNQQHL*L 
-NRAFEFSKWILRTINS ICDC 
TEHLSSASGF*EQSTASVIV 
7441 - TACCATTTTCATCATACTTGAGCATAAATGTAGTTGGCTTTAAATAGCCAACAAAATAGG - 7500
-YHFHHT*A*M*LALNSQQNR 
-TIFIILEHKCSWL*IANKIG 
PFSSYLSINVVGFK*PTK*A 
7501 - CTGCAGCTGACGTGCCCCAAATGTCTTGAGCAGGTGAAAAGGCTGTAAGAATGGCTCTAA - 7560
-LQLTCPKCLEQVKRL * E W L * 

- CS *RAPNVLSR*KGCKNGSK 

AADVPQMS *AGEKAVRMALK 
7561 - AATTTGTAATGTTAATACCAAGAGGCAACTTAAAAATAGGTTTCAAAGTGTTAAAACCAG - 7620
-NL*C*YQEAT*K*VSKC*NQ 
I CNVNTKRQLKNRFQSVKTR 
FVMLIPRGNLKIGFKVLKPE 
7621 - AAGGTAGATCACGAACTACATCTATAGGTTGATAGCCCTTATAAACATAGAGAAACCCAT - 7680

- K V D H E L H L * VDS PYKHRETH 

- R * ITNYIYRLIALINIEKPI 

GRSRTTSIG**PL*T*RNPS 
7681 - CTTTATTTTTAAACACAAACTCTCGTAAGTGTTTAAAATTACCTGACTTTTGTGAAACAT - 7740 

- L Y F * TQTLVSV* NYLTFLKH 

- F I FKHKLS * V F K I T * L F * N I 

LFLNTNSRKCLKLPDFSETS 
7741 - CAAGCGAAAAGGCATCAGATATGTACTCGAAAGTGGAATTAAATGCATTATCGAATATCA - 7800 
-QAKRHQI CTRKCN*MHYRIS 
-KRKGIRYVLESAIKCIIEYH 
SEKASDMYSKVQLNALSNII 
7801 - TAGTATGTGTCTGTGTACCCATGGGTTTAGAAACAGCAAAGAAAGGGTTGTCACACAATT - 7860
-*YVSVYPWV*KQQRKGCHTI 
-SMCLCTHGFRNSKERVVTQF 
VCVCVPMGLETAKKGLSHNS 
7861 - CAAAGTTACATGCTCGTATAACAACATTAGTAGAATTGTTAATAATAATCACCGACTGTG - 7920 
-QSYMLV*QH**NC***SPTV 
-KVTCSYNNISRIVNNNHRL* 
KLHARITTLVELLI I ITDCD 
7921 - ACTTGTTGTTCATGGTAGAACCAAAAACCCAACCACGGACAACATTTGATTTCTCTGTGG - 7980
-TCCSW*NQKPNHGQHLISLW 
-LVVHGRTKNPTTDNI * FLCG 
LLFMVEPKTQPRTTFDFSVA 
7981 - CAGCAAAATAAATACCATCCTTAAAAGGTATGACAGGGTTGCCAAACGTATGATTAATAG - 8040
-QQNKYHP*KV*QGCQTYD** 
-SKINTILKRYDRVAKRMINS 
AK*IPSLKGMTGLPNV*LIV 
8041 - TATGAAACCCTGTAACATTAGAATAAAATGGAAGAAATAAATCCTGAGTTAAATAAAGAG - 8100 
-YETL*H*NKMEEINPELNKE 
-MKPCN-IRIKWKK* I L S * I K S 
*NPVTLE*NGRNKS*VK*RV 
8101 - TGTCTGATCTAAAAATTTCATCAGGATAGTAAACCCCCCTCATAGATGAAGTATGTTGAG - 8160 
-CL I * KFHQDSKPPS*MKYVE 
-V* SKNFIRIVNPPHR*SMLS 
SDLKISSG**TPLIDEVC*V 
8161 - TGTAATTAGGAGCTTGAACATCATCAAAAGTGGTGGACCGGTCAAGGTCACTACCACTAG - 8220
-CN*ELEHHQKWCTGQGHYH* 
-VIRSLNIIKSGAPVKVTTTS 
*LGA*TSSKVVHRSRSLPLV 
8221 - TGAGAGTAAGAAATAATAAGAAAATAAACATGTTCGTTTAGTTGTTAACAAGAATATCAC - 8280
-*E*EIIRK*TCSFSC*QEYH
-ESKK**ENKHVRLVVNKNIT
RVRNNKKINMFV* LLTRISL 
8281 - TTGAAACCACAACTCTGTTGTTTTCTCTAATGATAAGCCTACCTTTTTCCAGAAGAGAAT - 8340
-LKPQLCCFL* * *AYLFPEEN 
-*NHNSVVFSNDKPTFFQKRI 
ETTTLLFSLMISLPFSRRE* 
8341 - AAATCATATCATTGATTTGATTCTCCTTAAGAGACATTACAGCAGTTCCTCTTAATTTAA - 8400

- K S Y H * F D S P * ETLQQFLLI* 
-NHIIDLILLKRHYSSSS* FK 

IISLI*FSLRDITAVPLNLR 
8401 - GAGGAAATTTGCTCATGTCAAAGAGTGAATAGGAAGAGAACTGGATAGGATTTGTGTTCC - 8460
-BE ICSCQRVNRKTTG * DLCS 

- RKFAHVKE* IGRQLDRICVP 

GNLLMSKSE*EDNWIGFVFL 
8461 - TCCAGAAAATGTAGTTAGCATGCATGGTATAGCCATCAATTTGTTCCTTCGGCTTGCCAA - 8520
-SRKCS*HAWYSHQFVPSACQ 
-PENVVSMHGIAINLFLRLAK 
QKM*LACMV*PSICSFGLPR 
8521 - GATAGTTAGCCCCAATTAAAAATGCTTCCGATGATGATGCATTTACATTTGTAACAAAAG - 8580

- D S * P QLKMLPMMMHLHL* Q K 
-IVSPN*KCFR**CIYICNKS 

*LAPIKNASDDDAFTFVTKA 
8581 - CTGTCCACCATGAGAAATGGCCCATAAGCTTGTAAAGGTCAGCATTCCAAGAATGCTCTG - 8640 

- L S TMRNGP *ACKGQHS KNAL 
-CPP*EMARKLVKVSIPRMLC 

VHHEKWPISL*RSAFQECSV 
8641 - TTATCTTTACAGCTATAGAACCACCCAGGGCTAGTTTTTGCTTTATAAATCCACACAGAT - 8700
-LSLQL*NHPGLVFAL* IHTD 
-YLYSYRTTQG*FLLYKSTQ1 
IFTAIEPPRASFCFINPHR* 
8701 - AAGTGAAAAAGCCTTCTTTAGAGTCATTCTCTTTTGTCACATGTTTGGTCCTAGGGTCAT - 8760

- K * KTLL*SHSLLSHVWS * GH 

SEKPFFRVILFCHMFGPRVI 
VKNPSLESFSFVTCLVLGSY 
8761 - ACATATCGCTAATAATAAGGTCCCATTTATTAGCCGTATGTACTGTTGCACAGTCTCCAA - 8820 

- T Y R * **GPIY*PYVLLHSLQ 

H I AN NKVP F I SRMYCCTVSN 
ISLIIRSHLLAVCTVAQSPI 
8821 - TTAAAGTAGAATCTGCGTCGGAGACGAAGTCATTAAGATCTGAATCGACAAGTAGTGTGC - 8880 

- L K * N LRRRRSH* DLNRQVVC 
-*SRICVGDEVIKI*IDK*CA

KVESASETKSLRSESTSSVP 
8881 - CAGTTGGCAACCATTGTCTGAGCACAGCTGTACCTGGTGCAACTCCTTTATCAGAGCCAG - 8940
-QLAT IV*AQLYLVQLLYQSQ 
-SWQPLSEHSCTWCNSFIRAS 
VGNHCLSTAVPGATPLSEPA 
8941 - CACCAAAGTGAATAACTCTCATGTTGTAGGGTACAGCTAAAGTAAGTGTATTTAAGTATT - 9000
-HQSE*LSCCRVQLK*VYLSI 
-TKVNNSHVVGYS*SKCI*VL 
P K * I T L M L * GTAKVSVFKY* 
9001 - GACACAGTTGAGTATACTTTGCGACATTCATCATTATTCGTTTTGGTATAACAGCATTTT - 9060
-DTVEYTLRHSSL FLLV*QHF 

- TQLSILCDIHHYSFWYNSIF 

RS*VYFATFIIIPFGITAFS 
9061 - CACCATAATTCTGAAGGTCACACTTTTCAAGAAGCATTCTTTGCATGTTGTAGAAGTTAG - 9120
-HHNSEGHTFQEAFFASCTS* 

- T I ILKVTLFKKHSLHLVQVR 

P*F*RSHFSR SILCILYKLG 
9121 - GCATGGCAACACCTGGTTGCCACGGTTGACTTGCTTGTAGTTTTGGGTAGAAGGTTTCAA - 9180 
-ASQHLVATLDLLVVLGRRFQ 
-HRNTWLPRLTCL* FWVEGFN 
IATPGCHA* LACSFG* KVST 
9181 - CATGTCCATCCTTACACCAAAGCATGAATGAAATTTCAGCATAGTCAATTGTAACCTTGA - 9240
-HVHPYTKA*MKFQHSQL* P * 
-MS ILTPKHE^NFS IVNCNLD 
CPSLHQSMNEISA*SIVTLT 
9241 - CCACTTTTGAAATCACTGACAAATCTTGTGACTTTATTATCTCGACAAAGTCATCAAGTA - 9300
-PLLKSLTNLVTLLSRQSHQV 
-HF*NH*QIL*LYYLDKVIK* 
TFEITDKSCDFIISTKSSSK 

9301 - AAAGATCAATCACAGAACACACACATTTTGATGAACCTGTTTGCGCATCTGTTATGAAGT - 9360
-KDQSQNTHILMNLFAHLL*S 
-KINHRTHTF* * TCLRICYEV 
RSITEHTHFDEPVCASVMK* 

9361 - AATTTTTCACTGTGCTGTCCATAGGGATAAAATCCTCTAATTTAAGTGGTGAATCTTGTG - 9420 
-NFSL.CCP*G*NPLI*VVNIiV 

- IFHCAVHRDKIL* F K W * I L * 

FFTVLS IGIKSSNLSGESCE 
9421 - AGCGCTTGGCTAAGCCTATCATTAAATGAAGACCGCCAAGTTGTCCATGACTGAAATCTC - 9480 
-SAWLSLSLNEDRQVVHD*NL 
-ALG*AYH*MKTAKLSMTEIS 
RLAKPIIK*RPPSCP*LKSP 
9481 - CATAAACGATGTGTTCGAAGGCATAGCCCTCGAGCTTATATCGCTGTATGAATTCATCCA - 9540
-HKRCVRRHS PRAYIAV* IHP 
-INDVFEGIALELISLYEFIH 
*TMCSKA*PSSLYRCMN SSI 
9541 - TAGCGAGCTCGAGAAAGTCAGTTTCCATTTGTGATCTGGGCTTAAAATCCTCTAAGTCTC - 9600 
-*RARESQFPFVIWA*NPLSIi 
-SELEKVSFHL*SGLKIL*VS
ASSRKSVSICDLGLKSSKSL 
9601 - TGCTCTGAGTAAAGTAGGTTTCAGGCAACTGTTGAATAATGCCGTCTACTTTCTTAAAGT - 9660

- C S E * SRFQATVE*CRLLS*S 
-ALSKVGFRQLLNNAVYFLKV 

L*VK*VSGNC*IMPSTFLK* 
9661 - AGTTAAACTGTGTTTTTACTGATTCTCCAATTAATGTGACTCCATTGACGCTAGCTTGTG - 9720 
-S*TVFLL ILQLM *I)H*R*LV 
-VKLCFY*FSN*CDSIDASLC 
LNCVFTDS PINVTPLTLACA 
9721 - CTGGTCCCTTTGAAGGTGTTAGACCTTTGACTGAACCTTCTGTTATTAAAACACCATTAC - 9780 
-LVPLKVLDL* LNLLLLKHHY 
-WSL '*RC*TFD*TFCY*NTIT 
GPFEGVRPLTEPSVIKTPLR 
9781 - GGGCGTTTCTAAAAAGGTCTACCTGTCCTTCCACTCTACCATCAAACAAGACAGTAAGTG - 9840 

- G R F * KGLPVLPLYHQTRQ*V 
-GVSKKVYLSFHSTIKQDSK* 

AFLKRSTCPSTLPSNKTVSE 
9841 - AAGAACAAGCACTCTCAGTAGGTTTCTTGGCAATGTCAGTCATTGTGCAGACACCTATTG - 9900
-KNKHSQ* VSWQCQSLCRHLL 
RTSTLSRFLGNVSHCADTYC 
EQALSVGFLAMSVIVQTPIV 
9901 - TAGATACATGTGCTGGGGCTTCTCTTTTGTAGTCCCAGATTACAGTATTAGCAGCGATAT - 9960 

- * IHVLGLLFCSPRLQY* QRY 
-RYMCWGFSFVVPDYSI-SSDI 

DTCAGASLL*SQITVLAAIS 
9961 - CAACACCCAAATTATTGAGTATCTTAATCTCTGGCACTGGTTTAATGTTACGCTTAGCCC - 10020
-QHPNY*VS*SLALV*CYA*P 

- N T Q I IEYLNLWHWFNVTLSP 

TPKLLS ILISGTGLMLRLAQ 
10021 - AAAGCTCAAATGCAACATTAACAGGAAGTGTTGTCTTATTTTCAAAGATCTCCACATCAA - 10080 
-KAQMQH * QEVLS YFQRS PHQ 

- KLKCNINRKCCLIFKDLHIN 

SSNATLTGSVVLFSKISTSI 
10081 - TACCATCTACCTTTGTGTAAACAGCATTATTAATGATGGAAACAGGTGCTTCGCCGGCGT - 10140
-YHLP LCKQHY* * WKQV LRRR 
-TIYLCVNS IINDGNRCFAGV 
PSTFV*TALLMMETGASPAC 
10141 - GTCCATCAAAGTGTCGTTTATTAACAACATTATAAGCCACATTTTCTAAACTCTGTAACC - 10200 
-VHQSVLY*QHYKPHFLNSVT 
-.SIKVSF INNIISHIF*TL*P 
PSKCPLLTTIi*ATFSKLCNL 
10201 - TGGTAAATGTATTCCACAGGTTATAAGTATCAAATTGTTTGTAAATCCATAGGCTAAATC - 10260 

- W * M Y STGYKYQIVCKS I G * I 
-GKCIPQVISIKLPVNP*AKS 

VNVFHRL*VSNCL*IHRLNP 
10261 - CAGCAGAAATCATCATATTATATGCATCCAAGTACTGTCGGTACTCATTTGCATGGTGTG - 10320
-QQKS SYYMHPSTVGTHLHGV 
-SRNHHIICIQVLSVLICMVS 
AEII ILYASKYCRYSFAWCL 
10321 - TGCAAACAGCACCACCTAAATTGCATCGTGTAATACACGTAGCAGATTTGAGTGGAACAT - 10380 
-CKQHHLNCIV*YT*QI * V E K 

- A N S T T * IASCNTRSRFEWNI 

QTAPPKLHRVIHVADLSGT* 
10381 - AATCAATATCCGACACTACTTGTTTGCCATGAGACTCACAAGGACTATCAGAATAGTAAA - 10440 
-NQYPTLLVCHETHKDYQNSK 
-INIRHYLFAMRLTRTI RIVK 
SISDTTCLP*DSQGLSE**K 
10441 - AGAAAGGCAATTGCTTTAAATTAGTAAATGCACTTTTATCGAAAGCTGGAGTGTGGAATG - 10500 
-RKAIALN * *MHFYRKLECGM 
-ERQLL*ISKCTFIESWSVEC 
KGNCFKLVNALLSKAGVWNA 
10501 - CATGCTTATTCACATACAAACTACCACCATCACAGCCTGGTAAGTTCAAGTTTGACAAGA - 10560
-HAYSHTNYHHHSLVSSSLTR
-MLIHIQTTTITAW*VQV*QD
CLFTYKLPPSQPGKFKFDKT 
10561 - CTCTTGTGTCAAACCTACACACAATTGCATTGGCTGGGTAACGATCAACGTTACAATTCC - 10620 
-LLCQTYTQLHWLGNDQRYNS 

- SCVKPTHNCIGWVTINVTIP 

LVSNLHTIALAG* RSTLQFQ 
10621 - AAAACAAACAAACACCATCAGTGAATTTATCGTGATGTGTAGCATAAGAATAGAAGAGTT - 10680 
-KTNKH HQ* IYRDV*HKNRRV 
-KQTNTISEFIVMCSIRIEEF 
NKQTPSVNLS*CVA*E*KSS 
10681 - CCTCTATTTTGTAAGCTTTGTCACTACATGGCTGAGCATCGTAGAACTTCCATTCTACTT - 10740 
-PLFCKLCHYMAEHRRTS ILL 

- LY FV S FVT TWLS I VEL-P FYF 

SIL*ALSLHG*AS*NFHSTS 
10741 - CAGCCTGAGGCACACACTTGATAGCCTTTGGATTTCCAATGTCATGAAGAACTGGAAACT - 10800 
-QPEAHT** PLDFQCHEELET 
-SLRH TLDSLW.ISNVMKNWKL 
A*GTHLIAFGFPMS*RTGNL 
10801 - TATCAGCAAGCAATGCAGACTTCACAACCATGTGTTGTACTTTTCTGCAAGCAGAATTAA - 10860
-YQQAMQTS QPCVVLFCKQN* 
-ISKQCRLHNHVLYFSASRIN 
SASNADFTTMCCT FLQAELT 
10861 - CCCTCAGTTCATCTCCTATAATAGGGTATTCAACAGAGCAATGAACGCGCTTAACAAAGC - 10920 
-PSVHLL* * GIQQTNQRA* QS 

- P Q F I S YNRVFNRP INALNKA 

LSSSP IIGYSTDQSTRLTKH. 
10921 - ACTCATGGACTGCTAAACATCTAGTCATGATAGCATCACAACTAGCCACATGTGCATTTC - 10980
-THGLLNI * S**HHN*PHVHF 

- LMDC*TSSHDSITTSHMCIS 

SWTAKHLVMIASQLATCAFP 
10981 - CATGTACCTGGCAATGTTGGTCATGGTTACTCTGAAGGTTACCCGTAAAGCCCCACTGCT - 11040
-HVPGNVGHGYSEGYP* SPTA 
-MYLAMLVMVTLKVTRKAPLL 
CTWQCWSWLL*RLPVKPHC* 
11041 - GAACATCAATCATAAATGGGTTATAGACATAGTCAAAACCCACAGAATGATTCCAGCAGG - 11100 
-EHQS *MGYRHSQNPQNDSSR 

- N INHKWV I DIVKTHRMIPAG 

TSIINGL*T*SKPTE*FQQA 
11101 - CATAAGTATCTGATGAAGTAGAAAAGCAAGTTGCACGTTTGTCACACAGACAACACGTTC - 11160 
-HKYLMK* KSKLHVCHTDNTF 
-ISI**SRKASCTFVTQTTRS 
*VSDEVEKQVARLSHRQHVL 
11161 - TTTCAGGTCCAATCTTGACAAAGTACTTCATTGATGTAAGCTCAAAGCCATGCGCCCAAA - 11220 
-FQVQS*QSTSLM*A.QSHAPK 

- FRSNLDKVLH* CKLKAMRPK 

SGPILTKYFIDVSSKPCAQR 
11221 - GGACGAACACGACTCTGTCTGACAATCCTTTCAGTGTATCACTGAGCATTTGTACTATCT - 11280 
-GRTR3jCLTILSVYH*'AFVLS 
DEHDSV*QSFQCITEHLYYL 
TNTTLSDNPFSVSLSICTIL 
11281 - TAATACGCACTACATTCCAGGGCAAGCCTTTATACATGAGTGGTATAAGATGTTTAAACT - 11340 
-*YALHSRASLYT*VV*DV*T
-NTHYIPGQAFIHEWYKMFKL 
IRTTFQGKPLYMSGIRCLNW 
11341 - GGTCACCTGGTGGAGGTTTTGCATTAACTCTGGTGAATTCTGTGTTATTTTCAGTGTCAA - 11400 
-GHLVEVLH* LW * ILCYFQCQ 
-VTWWRFCINSGEFCVIFSVN 
SPGGGFALTLVNSVLFSVST 
11401 - CATAACCAGTCGGTACAGCTACTAAGTTAACACCTGTAGAAAATCCTAGCTGGAGAGGTA - 11460
-HNQSVQLLS *HL*KILAGEV 
-ITSRYSY*VNTCRKS*LER* 
*PVGTATKLTPVENPSWRGR 
11461 - GGTTAGTACCCACAGCATCTCTAGTTGCATGACAGCCCTCTACATCAAAGCCAATCCACG - 11520
-G*YPQHL*LHDSPLHQSQST 
-VSTHSISSCMTALYIKANPR 
LVPTASL, VA*QPSTSKPIHA 
11521 - CACGAACGTGACGAATAGCTTCTTCGCGGGTGATAAACATATTAGGGTAACCATTGACTT - 11580 
-HERDE±LLRG**TY*GNH*L 
-TNVTNSFFAGDKHIRVTI DL 
RT*RIASSRVINILG*PLTW 

11581 - GGTAATTCATTTTGAAACCCATCATAGAGATGAGTCTACGGTAGGTCATGTCCTTTGGTA - 11640
-GNSF*NPS*R*VYGRSCPLV
-VIHFETHHRDESTVGHVLWY
*FILKPIIEMSLR*VMSFGM
11641 - TGCCTGGTATGTCAACACATAATCCTTCAGTCTTGAATTTTATATCAACGCTGAGGTGTG - 11700
-CLVCQHIILQS*ILYQR*GV
-AWYVNT*SFSLEFYINAEVC
PGMSTHNPSVLNFISTLRCV
11701 - TAGGTGCCTGTGTAGGATGAAGACCAGTAATGATCTTACTACAGTCCTTAAAAAGTCCAG - 11760
-*VPV*DEDQ**SYYSP*KVQ
-RCLCRMKTSNDLTTVLKKSS
GACVG*RPVMILLQSLKSPV
11761 - TTACATTTTCTGCTTGTAATGTAGCCACATTGGGACGTGGTATTTCTAGACTTGTAAATT - 11820
-LHFLLVM*PHCDVVFLDL * I 
-YIFCL*CSHIATWYF*TCKL 
TFSACNVATLRRGISRLVttC 

11821 - GCAGTTTGTCATAAAGATCTCTATCAGACATTATGCACAAAATGCCAATTTTTGCCCTTG - 11880 
-AVCHKDLYQTLCTKCQFLPL 

- QFVIKI SIRHYAQNANFCPC 

SLS*RSLS DIMHKMPI FALV 
11881 - TGATAGCCACATTGAAGCGGTTGACATTACAAGAGTGTGCTGTTTCAGTAGTTTGTGTGA - 11940
-**PH*SG*HYKSVLFQ*FV* 
-DSHIEAVDITRVCCFSSLCE 
IATLKRLTLQECAVSVVCVN 
11941 - ATATGACATAGTCATATTCAGAACGCTGTGATGAATCAACAGTCTGCGTAGGCAATCCTA - 12000
-I *HSHIQNPVMNQQSA*AIL 
-YDIVIFRTL** INSLRRQS* 
MT*SYSEPCDESTVCVGNPK 
12001 - AGATTTTTGAAGCTACAGCGTTCTGTGAATTATAAGGTGAGATAAAAACAGCTTTTCTCC - 12060 
-RFLKLQRSVNYKVR* KQL FS 
-DF*SYSVL*IIR*DKNSFSP 
IFEATAFCEL*GEI KTAFLQ 
12061 - AAGCAGGATTGCGTGTAAGAAATTCTCTTACAACGCCTATTTGAGGTCTGTTGATTGCAG - 12120 
-KQDCV*EILLQRLFEVC*LQ 
-SRIACKKFSYNAYLRSVDCR 
AGLRVRNSLTTPI* GLLIAD 
12121 - ATGAAACATCATGTGTAATAACACCTTTGTAGAACATTTTGAAGCATTGAGCTGACTTAT - 12180 

- M K H H V * *HLCRTF* SIELTY 

* NIMCNNTFVEHFEALS*LI 
ETSCVITPL*NILKH*ADLS 
12181 - CCTTGTGTGCTTTTAGCTTATTGTCATAAACTAAAGCACTCACAGTGTCAACAATTTCAG - 12240 
-PCVLLAYCHKLKHSQCQQFQ 
-LVCF*LIVIN*STHSVNNFS 
LCAFSLLS*TKALTVSTISA 
12241 - CAGGACAACGGCGACAAGTTCCAAGGAACATGTCTGGACCTATTGTTTTCATAAGTCTGC - 12300 

- Q DNGDKFQGTCL DLL FS * VC 

- R T T A T S SKEHVWTYCFHKSA 

GQRRQVPRNMSGPIVFI SLH 
12301 - ACACTGAATTAAAATATTCTGGTTCTAGTGTGCCTTTAGTCAGCAATGTGCGGGGGGCTG - 12360 

- T LN *NILVLVCL* SAMCGGL 

- H * IKIFWF*CAFSQQCAGGW 

TELKYSGS SVPLVSNVRGAG 
12361 - GTAATTGAGCAGGATCGCCAATATAGACGTAGTGTTTTGCACGAAGTCTAGCATTGACAA - 12420 
-V IEQDRQYRRSVLHEV* H * Q 
*LSRIANIDVVFCTKSSI DM 
N*AGSPI*T*CFARSLALTT 
12421 - CACTCAAGTCATAATTAGTAGCCATAGAGATTTCATCAAAGACTACAATGTCAGCAGTTG - 12480 
-HSSHN**P*RFHQRLQCQQL 
-TQVIISSHRDFIKDYNVSSC 
LKS*LVAIEISSKTTMSAVV 
12481 - TTTCTGGCAATGCATTTACAGTGCAGAAAACATACTGTTCTAGTGTTGAATTCACTTTGA - 12540 
-FLAMHLQCRKHTVLVLNSL* 
FWQCIYSAENILF*C* IHFE 
SGNAFTVQKTYCSSVEFTLN 
12541 - ATTTATCAAAACACTCTACGCGCGCACGCGCAGGTATGATTCTACTACATTTATCTATGG - 12600 

- I YQNTLRAHAQV* FYYIYLW 

FIKTLYARTRRYDSTTFIYG 
LSKHSTRARAGMILLRLSMG 
12601 - GCAAATATTTTAATGCCTTTTCACATAGGGCATCAACAGCTGCATGAGAGCATGCCGTAT - 12660
-ANILMPFHIGHQQLHESMPY
-QIF*CLFT*GINSCMRACRI 
KYFNAFSHRASTAA*EHAVY 
12661 - ACACTATGCGAGCAGATGGGTAATAGAGAGCAAGTCCGATGGCAAAATGACTCTTACCAG - 12720 
-TLCEQMGNREQVRWQNDS YQ 
-HYASRWVIESKSDGKMTLTS 
T M R A D G * * R A S P M A K * LLPV 
12721 - TACCAGGTGGTCCTTGGAGTGTAGAGTACTTTTGCATGCCGACCTTTTGATAATTTGCAA - 12780 
-YQVVLGV*STFACRPFDNLQ 
-TRWSLECRVLLHADLLIICN 
PGGPWSVEYFCMPTF* * F A T 
12781 - CATTGCTAGAAAACTCATCTGAGATGTTGAGTGTTGGGTACAAGCCAGTAATTCTCACAT - 12840 
-HC*KTHLRC*VLGTSQ*FSH 
-IARKLI*DVECWVQASNSHI 
LLENSSEMLSVGYKPV ILT* 
12841 - AGTGCTCTTGTGGCACTAGAGTAGGTGCACTAAGTGGCATTACAGTGTGAGATGTCAACA - 12900 
-SALVALE *VH*VALQCEM ST 
-VLLWH*SRCTKWHYSVRCQH 
CSCGTRVGALSGITV* DVNT 
12901 - CAAAGTAATCACCAACATTCAACTTGTATGTCGTAGTACCTCTGTACACAACAGCATCAC - 12960
-QSNHQHSTCMS * YLCTQQHH 
-KVITNIQLVCRSTSVHNSIT 
K * SPTFNLYVVVPLYT TASP 
12961 - CATAGTCACCTTTTTCAAAGGTGTACTCTCCAATCTGTACTTTACTATTTTTAGTTACAC - 13020
-HSHL FQRCTLQSVLYYF* LH 

- IVTFFKGVLSNLYFTI FSYT 

* S P FSKVYSP IC TLLF LVTR 
13021 - GGTAACCAGTAAAGACATAGTTTCTGTTCAATGGTGGTCTAGGTTTTCCAACCTCCCATG - 13080 
-GNQ*RHSFCSMVV*VFQPPM 
-VTSKDIVSVQWWSRFSNLP* 
*PVKT*FLFNGGLGFP TSHE 
13081 - AAAGATGCAATTCTCTGTCAGAGAGTACTTCGCGTACAGTGGCAATACCATATGACAGCT - 13140 
-KDAI LCQRVLRVQWQY R M TA 
-KMQFSVREYFAYSGNTI* QL 
RCNSLSESTSRTVAI PYDSL 
13141 - TAAATGTTTCCTCAGTGGCTTTGAGCGTTTCTGCTGCGAAAAGCTTGAGTCTCTCAGTAG - 13200 

- * MFPQWL*AFLLRKA*VSQY 
-KCFLSGFERFCCEKLESLST 

NVSSVALSVSAAKSLSLSVQ 
13201 - AAGTGTTGGCAAGTATGTAATCGCCAGCATTAGTCCAATCACATGTTGCTATCGCATTGA - 13260 
-KCWQVCNRQH* SNHMLLSH* 
-SVGKYVIASISPITCCYRIE 
VLASM^SPALVQSHVAIALK 
13261 - AGTCAGTGACATTGTCACTGCCTACACATGTGTTTTTGTATAAACCAAAAACCTGACCAT - 13320 

- S Q * HCIiCLHMCFCINQKPDH 
-VSDIVTAYTCVFV*TKNLTI 

SVTLSLPTHVFLYKPKT*PL 
13321 - TAGCACATAATGGAAAACTAATGGGAGGCTTATGTGACTTGCAATAATAGCTCATACCTC - 13380 
-*HIMEN*WEAYVTCNNSSYL 
-ST*WKTNGRLM*"LAI IAHTS 
AHNGKLMGGLCDLQ* * LIPP 
13381 - CTAGATACAGTTGTGTCACATCAGTGACATCACAACCTGGGGCATTGCAAACATAGGGAT - 13440 
-LDTVVSHQ* HHNLGHCKHRD 

- * IQLCHIS DITTWGIANIGI 

RYSCVTSVTSQPGALQT*GL 
13441 - TAACAGACAACACTAATTTGTGTGATGTTGAAATGACATGGTCATAGCAGCACTTGCAAC - 13500
-*QTTLICVMLK*HGHSSTCN 
-NRQH*FV*C*NDMVIAALAT 
TDNTNLCDVEMTWS * QHLQH 
13501 - ATAGGAATGGTCTCCTAATACAGGCACCGCAACGAAGTGAAGTCTGTGAATTGCACAATA - 13560 
-I GMVS *YRHRNEVKSVNCTI 
-*EWSPNTGTATK*SL* IAQY 
RNGLLIQAPQRSEVCELHNT 
13561 - CACAAGCACCTACAGCCTGCAAGACTGTATGTGGTGTGTACATAGCCTCATAAAACTCAG - 13620 
-HKHLQPARLYVVCT* PHKTQ 
-TSTYSLQDCMWCVHSLIKLR 
QAPTACKTVCGVYIAS * N S G 
13621 - GTTCCCAGTACCGTGAGGTGTTATCATTAGTTAGCATTACGGAATACATGTCCAACATGT - 13680 
-VPSTVRCYH*LALRNTCPTC 
-FPVP*GVIIS*HYGIHVQHV
SQYREVLSLVSI TEYMSNMW 
13681 - GGCCAGTAAGCTCATCATGTAACTTTCTAATGTATTGTAAATACAAGTGAAAGACATCAG - 13740
-GQ*AHHVTF* CIVNTSERHQ 
-ASKLIM*LSNVL*IQVKDIS 
PVSSSCNFLMYCKYK* KTSA 
13741 - CATACTCCTGATTAGGATGTTTTGTAAGTGGGTAAGCATCAATAGCCAGTGACACGAACC - 13800

- H T P D * DVL*VGKHQ* PVTRT 
-ILLIRMFCKWVSINSQ*HEP 

YS*LGCFVSG*ASIASDTNL 
13801 - TTTCAATCATAAGTGTACCATCTGTTTTGACAATATCATCGACAAAACAGCCTGCGCCTA - 13860 

- F Q S * V Y H L F * QYHRQNSLRL 
-FNHKCTICFDNI I D K T A C A * 

SIISVPSVLTISSTKQPAPN 
13861 - ATATTCTTGATGGATCTGGGTAAGGCAGGTACACGTAATCATCTCCTTGTTTAACTAGCA - 13920
-I FLMDLGKAGTRNHLLV*LA 

- YS*WIWVRQVHVIISLFN*H 

ILDGSG*GRYT*SSPCLTSI 
13921 - TTGTATGCTGTGAGCAAAATTCGTGAGGTCCTTTAGTAAGGTCAGTCTCAGTCCAACATT - 13980 
-LYAVSKIREVL* *GQSQSNI 
-CML*AKFVRSFSKVSLSPTF 
VCCEQNS*GPLVRSVSVQHF 
13981 - TTGCCTCAGACATGAACACATTATTTTGATAATAAAGAACTGCCTTAAAGTTCTTAATGC - 14040 
-LPQT * THYFDNKELP * SS * C 
-CLRHEHIILIIKNCLKVLNA

ASDMNTLF***RTALKFLML 
14041 - TAGCTACTAAACCTTGAGCCGCATAGTTACTGTTATAGCACACAACGGCATCATCAGAAA - 14100 

- * LLNLEPHSYCYSTQRHHQK 

- SY*TLSRIVTVIAHNGIIRK 

ATKP*AA*LLL*HTTASSER 
14101 - GAATCATCATGGAGAAATGTTTACGCAGGTAAGCGTAAAACTCATCCACGAATTCATGAT - 14160 

- E SSWRNVYAGKRKTHPKIHD 
-NHHGEMFTQVSVKLIHEFMI 

IIMEKCLRR*A*NSSTNS*S- 
14161 - CAACATCCCTATTTCTATAGAGACACTCATAGAGCCTGTGTTGTAGATTGCGGACATACT - 14220 
-QHPYFYRDTHRACVVDCGHT 
-NIPISIETLIEPVL*IADIL 
TSLFL*RHS*SLCCRLRTYL 
14221 - TGTCAGCTATCTTATTACCATCAGTTGAAAGAAGTGCATTTACATTGGCTGTAACAGCTT - 14280 
-CQLSYYHQLKEVHLHWL* QL 
-VSYLITIS*KKCIYIGCNSL 
SAILLPSVERSAPTIiAVTA* 
14281 - GACAAATGTTAAAGACACTATTAGCATAAGCAGTTGTAGCATCACCGGATGATGTTCCAC - 14340

- D K C * RHY* H K Q L * HHRMMFH 
-TNVKDTISISSCSITG^CST 

QMLKTLLA*AVVASPDDVPP 
14341 - CTGGTTTAACATATAGTGAGCCGCCACACATGACCATCTCACTTAATACTTGCGCACACT - 14400 
-LV*HIVSRHT*PSHLILAHT

- W F N I * *AATHDHLT* YLRTL 

GLTYSEPPHMTISLNTCAHS 
14401 - CGTTAGCTAACCTGTAGAAACGGTGTGATAAGTTACAGCAAGTGTTATGTTTGCGAGCAA - 14460 
-R*LTCRNGVISYSKCYVCEQ
-VS*PVETV**VTASVMFASK 
L A N L * KRCDKLQQVLC LRAR 
14461 - GAACAAGAGAGGCCATTATCCTAAGCATGTTAGGCATGGCTCTGTCACATTTTGGATAAT - 14520 
-EQERPLS *AC*AWLCH I LDN " 
-NKRGHYPKH-VRHGSVTFWII 
TREAI ILSMLGMALSHFG**S 
14521 - CCCAACCCATAAGGTGTGGAGTTTCTACATCACTGTAAACAGTTTTTAACATATTATGCC - 14580
-PNP*GVEFLHHCKQFLTYYA 
-PTH'KVWSFYITVNSF*HIMP 
QPIRCGVSTSL*TVFNILCQ 
14581 - AGCCACCGTAAAACTTGCTTGTTCCAATTACCACAGTAGCTCCTGTAGTGGCGGCTATTG - 14640
-SHRKTCLFQLPQ*LL*WRLL 
-ATVKLACSNYHSSSSSGGY* 
PP*NLLVPITTVAPLVAAID 
14641 - ACTTCAATAATTTCTGATGAAACTGTCTATTTGTCATAGTACTACAGATAGAGACACCAG - 14700
-TSIISDETVYLS*YYR*RHQ 
-LQ*FLMKLSICHSTTDRDTS 
FNNF* *NCLFVIVLQI ETPA 
14701 - CTACGGTGCGAGCTCTATTCTTTGCACTAATGGCATACTTAAGATTCATTTGAGTTATAG - 14760
-LRCELYSLH*WHT* DSFEL* 
-YGASSILCTNGILKIHLSYS 
TVRALFFALMAYLRFI*VIV 
14761 - TAGGGATGACATTACGCTTAGTATACGCGAAAAGTGCATCTTGATCCTCATAACTCATTG - 14820
-*G*HYA*YTRKVHLDPHNSL 
-RDDITLSIREKCILILITH* 
GMTLRLVYAKSAS * S S * L I E 
14821 - AGTCATAATAAAGTCTAGCCTTACCCCATTTATTAAATGGGAAACCAGCTGATTTATCCA - 14880
-SHNKV* PYPIY*MGNQLIYP 
-VIIKSSLTPFIKWETS*FIQ 
S * * SLALPHLLNGKPADLSR 
14881 - GATTGTTAACGATTACTTGGTTGGCATTAATACAGCCACCATCGTAACAATCAAAGTATT - 14940
-DC* RLLGWH* YSHHRNNQSI 
-IVNDYLVGINTATIVTIKVF 
LLT ITWLALIQPPS*QSKYL 
14941 - TATCAACAACTTCAACTACGAATAGGAGTTGTCTGATATCACACATTGTTGGCAGATTAT - 15000 
-YQQLQLRIGVV * YHTLLADY 
-INNFNYE*ELSDITHCWQII 
STTSTTNRSCLISHIVGRL* 
15001 - AACGATAATAGTCATAATCACTGATAGCAGCGTTGCCATCCTGAGCAAAGAAGAAGTGTT - 15060 
-NDNSHNH**QRCHPEQRRSV 
-TIIVIITDSSVAILSKEEVF 
R**S*SLIAALPS*AKKKCF 
15061 - TTAGTTCAACAGAACTTCCTTCCTTAAAGAAACCTTTAGACACAGCAAAGTCATAAAAGT - 15120 
-LVQQNFLP*RNL*TQQSHKS 
-*FNRTSFLKETFRHSKVIKV 
SSTELPSLKKPLDTAKS*KS 
15121 - CTTTATTAAAATTACCGGGTTTGACAGTTTGAAAAGCAACATTGTTTGTTAGTGCAGCTA - 15180
-LY*NYRV*QFEKQHCLLVQL 

- FIKITGFDSLKSNIVC*CSY 

LLKL PGLTV* KATLFVSAAT 
15181 - CTGAAAAGCATGTAGTGCGTTTATCTAGCAATAAATTGCCAGAAGCTGCATGCATAGCTG - 15240
-LKSM*CVYLAINCQKLHA*L
-*KACSAFI*Q*IARSCMHSW
EKHVVRLSSNKLPEAACIAG
15241 - GATCAGCAGCATACACTAAAAGTTCCTTGAAACTGAGACGCGAGCTATGTAAGTTTACAT - 15300
-DQQHTLKVP*N*DASYVSLH
-ISSIH*KFLETETRAM*VYI
SAAYTKSSLKLRRELCKFTS
15301 - CCTGATTATGTACGACTCCTAACTCACGAAAATGGTATCCAGTTGAAACAACAAAAGGAA - 15360
-PDYVRLLTHENGIQLKQQKE
-LIMYDS*LTKMVSS*NNKRN
*LCTTPNSRKWYPVETTKGT
15361 - CACCATCTACAAATATTTTTCTTACTAGTGGTCCAAAACTTGTAGGTGGAAACACAGTAG - 15420
-HHLQIFFLLVVQNL*VETQ*

-TIYKYFSY*WSKTCRWKHSR 
PSTNIFLTSGPKLVGGNTVE 
15421 - AAAATAACACATTAAAGTTTGCACAATGAAGGATACACCTATCATCCAAACAGTTAATAC - 15480 
-KITH*SLHNEGYTYHPNS*Y 
-K*HIKVCTMKDTPI IQTVNT 
NNTLKFAQ* RIHLS SKQLIQ 
15481 - AATTGGGATGGTATGTCTGGTCCCAATATTTAAAATAACGGTCGAAGAGACAAAGTCTCT - 15540 
-NWDGMSGPNI*NNGRRDKVS 
-I GMVCLVPIFKI TVEETKSL 
LGWYVWSQYLK* RSKRQSLS 
15541 - CTTCCGTAAAATCATATTTCAGCAAATCCCACTTAATAAGTGGTTTTGCGAGATCAGCAT - 15600
-LP*NHISANPT* *VVLRDQH 

- F R K I IFQQIPLNKWFCEI SI 

SVKSYFSKSHLI SGFARSAS 
15601 - CCATATGGGACTCAGCAGCCAATGCCCTAGTCAAAGTGAGGATGGGCATCAGCAATGAGT - 15660 
-PYGTQQPMP* SK*GWASAMS 
-HMGLSSQCPSQSEDGHQQ*V 
IWDSAANALVKVRMGISNE* 
15661 - AATATGAATCCACAATAGGAACTCCGCAGCCTGGTGCTACTTGTACGAAATCACCGAAAT - 15720
-NMNPQ*ELRSLVLLVRNHRN
-I *IHNRNSAAWCYLYEITEI 
YEST IGTPQPGATCTKSPKS 
15721 - CGTACCAGTTCCCATTAAGATCCTGATTATCTAATGTCAGTACGCCTACAATGCCTGCAT - 15780 
-RTSSH*DPDYLMSVRLQCLH 
-VPVPIKILII*CQYAYNACI 
YQFPLRS*LSNVSTPTMPAS 
15781 - CACGCATAGCATCGCAGAATTGTACAGTCTTTAATAATGATTGGCGTACACGCTCACCTA - 15840 
-HA*HRRIVQSLIMIG VHAHL 
-THSIAELYSL* * *LAYTLT* 
RIASQNCTVFNN DWRTRSPK 
15841 - AGTTAGCATATACGCGTAAGATGTCAGGATTCTCTACGAAGTCATACCAATCCTTCTTAT - 15900 
-S*HIRVRCQDSLRSHTNPSY
-VSIYA*DVRILYEVIPILLI 
LAYTRKMSGFSTKSYQSFLL 
15901 - TGAAATAATCATCATCACAGCAATTGTATGTGACGAGTATTTCTTTTAATGTATCACAAT - 15960 

- * NNHHHS NCM* RVFLLMYHN 
-EIIIITAIVCDEYFF*CITI 

K*SSSQQLYVTSISFNVSQL 
15961 - TACCCTCATCAAAATGACGTAGAGCATAGACTAAATCAGCCATTGTGTATTTAGTTAGAC - 16020
-YPHQNDVEHRLNQPLCI * LD 
-TLIKMT*SID*ISHCVFS*T 
PSSK*RRA*TKSAIVYLVRR 
16021 - GCTGACGTGATATATGTGGTACCATGTCACCATCTACTCTAAACTTGAAAAAGTCATGGA - 16080 
-ADVIYVVPCHHLL*T*KSHG
-LT*YMWYKVTIYSKLEKVMD 
*RDI CGTMSPSTLNLKKSWT 
16081 - CAGCAACCGCTGGACAATCTTTAACCAAGTTATAAATAGTCTCTTCATGTTGGTAGTTAG - 16140
-QQPLDNL* P S Y K ^ SLHVGS * 
-SNRWTIFNQVINSLFMLVVR 
ATAGQSLTKL* IVSSCW*IiD 
16141 - ACATAGTATGCCTCTTAACTACAAAGTAAGAGTCTAATAAATTGCCTTCCTCATCCTTCT - 16200 
-T*YAS*LQSKSLINCLPHPS 

- HSMPLNYKVRV** IAFL I LL 

IVCLLTTK^ESNKLPSSSFS 
16201 - CCTGGAAGCGACAGCAATTAGTTTTTAGGAACTTTGCAAAACCAGCACTTTTTTCGTTGT - 16260 
-PGSDSN* FLGTLQNQHFFRC 

- L E A T A I SF*ELCKTSTFFVV 

WKRQQLVFRNFAKPALFSL* 
16261 - AAATATCAAAAGCCCTGTAGACGACATCAGTACTAGTGCCTGTGCCGCACGGTGTAAGAC - 16320 
-KYQKPCRRHQY* C L C R T V * D 
-NIKSPVDDISTSACAARCKT 
ISKAL*TTSVLVPVPHGVRR 
16321 - GGGCTGCACTTACACCGCAAACCCGTTTAAAAACGTTGATGCATCCGCAGACTGCATCAA - 16380 
-GLHLHRKPV* K R * CIRRLHQ 
-GCTYTANP FKNVDASADCIK 
AALT PQTRLKTLMHPQTASR 
16381 - GGGTTCGCGGAGTTGGTCACAACTACAGCCATAACCTTTCCACATTCCGCAGACGGTACA - 16440 
-GFAELVTTTAITFPHSADGT 
-GSRSWSQLQP*PFHIPQTVQ 
VRGVGHNYSHNLSTFRRRYR 
16441 - GACTGTGTTTCTAAGTGTAAAACCCACTGGGTCATTAGCACAAGTGGTAGGTATTTGGAC - 16500 
-DCVS KCKTHWV I STSGRYLD 
-TVFLSVKPTGSLAQVVGIWT 
LCF*V*NPIiGH*HKW* VFGR 
16501 - GTACTTACCTTTCAAGTCACAGAATCCTTTAGGATTTGGATGGTCAATGTGGCATCTACA - 16560 
-VLTFQVTES FRIWMVNVAST 
-YLPFKSQNPLGFGWSMWHLQ 
TYLS SHRII»*DLDGQCGIYN 
16561 - ATACAGACAACATGAAGCACCACCAAAGGACTCTTGGTCCATGTTAGCTTCTGGTGTTAC - 16620 
-IQTT* STTKGLLVHVSFWCY 
-YRQHEAPPKDSWSMLASGVT 
TDNMKHHQRTLGPC* LLVLQ 
16621 - AGTAATTGCCTGTCCTGTACCAGTGTGTGTACACAACATCTTCACACAGTTGGTGATTGG - 16680 
-SNCLSCTSVCTQHLHTVGDW 
-VIACPVPVCVRNIFTQLVIG 
*LPVLYQCVYTTSSHSW*LV 
16681 - TTGTCCTCCACTTGCTAGGTAATCCTTATATGCTTTAGCAGGGTCTACTGCAAAAGCACA - 16740 
-LSSTC*VILICFSRVYCKST 
-CPPLAR* S LYALAGSTAKAQ 
VLHLLGNPYML*QGLLQKHR 
16741 - GAAGGAAAGCACAGTTGAATTGGCAGGTACTTCTGTAGCATTTCCAGCCTGAAGACGTAC - 16800 
-EGKHS * IGRYFCS ISSLKTY 
-KESTVELAGTSVAFPA* RRT 
RKAQLNWQVLL*HFQPEDVL 
16801 - TGTAGCAGCTAAACTGCCCAGCACCATACCTCTATTTAGGTTGTTTAAGCCTTTGATGAA - 16860
-CSS*TAQHHTSI*VV*AFDE 
-VAAKLPSTIPLFRLFKPLMK 
*QLNCPAPYLYLGCLSL**S 
16861 - GTACAAGTATTTCACTTTAGGCCCTTTTGGTGTGTCTGTAACAAACCTACAAGGTGGTTC - 16920
-VQVFHFRP FWCVCNKPTRW F 
-YKY FTLGPFGVSVTNLQGGS 
TSrSL*ALLVCL*QTYKVVP 
16921 - CAGTTCTGTGTAAATTGTACCTGTACCATCACTCTTAGGGAATCTAGCCCATTTGAGATC - 16980 
-QFCVNCTCTITLRESS PFE I 
-SSV*IVPVPSLLGNLAHLRS 
VLCKLYLYHHS*GI * P I * D L 
16981 - TTGGTGGTCTGATAGTAATGCCAGCACAAACCTACCTCCCTTCGAATTGTTATAGTAGGC - 17040
-LVV* **CQHKPTSLRIVIVG 
-WWS DSNASTNLPPFELL* * A 
GGLIVMPAQTYLPSNCYSRQ 
17041 - AAGTGCATTGTCATCAGTACAAGCTGTTTGTGTGGTACCAGCCGCACAGGACATCTGTCG - 17100 
-KCIVISTSCLCGTSRTGHLS 
-SALSSVQAVCVVPAAQDICR 
VHCHQYKLFVWYQPHRTSVV 
17101 - TAGTGCTACTGGACTCAGTTCATTATTCTGTAGTTTAACAGCTGAGTTGGCTCTTAGAGC - 17160 
-*CYWTQFIIL*FNS*VGS*S 
-SATGLSSLFCSLTAELALRA 
VLLDSVHYSVV*QLSWIiLEL 
17161 - TGTAACAATAAGAGGCCAAGCCAAATTTGGTGAATTGTCCATGTTAATTTCACTAAGTTG - 17220 
-CNNKRPSQIW* IVHVNFTKL 
-VTIRGQAKFGELSMLISLS* 
*Q*EAKPNLVNCPC*FK*VE 
17221 - AACAATCTTGCTATCCGCATCAACAACTTGCTGGATTTCCCAGAGTGCAGATGCATATGT - 17280 
-NNLAIRINNLLDFPECRCIC 
-TILLSASTTCWISQSADAYV 
QSCYPHQQLAGFPRVQMHM* 
17281 - AAAGGTGTTACCATCACAAGTGTTCTTGTAGGTACCATAATCAGGGACAACAACCATGAG - 17340 
-KGVT I TSVLVGT I IRDNNHE 
-KVLPSQVF L*VP*SGTTTMS 
RCYHHKCSCRYHNQGQQP*V 
17341 - TTTGGCTGGTGTAGTCAATGGTATGATGTTGAGTGGAACACAACCATCAGGCGCATTGTT - 17400 
-FGCCSQWYDVEWNTTITRIV 
-LAAVVNGMMLSGTQPSRALL 
WLL*SMV* C*VEHNHHAHC* 
17401 - GATAATGTTGTTAAGTGCATCATTATCAAGCTTCCTAAGCATAGTGAAGAGCATTGTTTG - 17460
-DNVVKCI IIKLPKHSEEHCL 
-IMLLSASLSSFLSIVKSIVC 
*CC*VHHYQAS*A* * R A L F A 
17461 - CATAGCACTAGTTACTTTTGCCCTCTTGTCCTCAGATCTTGCCTGTTTGTACATTTGGGT - 17520
-HS TS YFC PLVLR SCLFVHLG 
IALVTFALLSSDLACLYIWV 
*H*LLLPSCPQILPVCTFGS 
17521 - CATAGCCTGATCTGCCATCTTTTCCAACTTGCGTTGCATGGCAGCATCACGGTCAAACTC - 17580
-HSLI CHLFQLALHGSITVKL 
- I A * SAI FSNLRCMAASRSNS 
*PDLPSFPTCVAWQHHGQTQ 
17581 - AGATTTAGCCACATTCAAAGATTTCTTTAACTTTTTGAGAACGACTTCAGAATCACCATT - 17640 
-*RFSHIQRFL*LFENDFRITI 
-DLATFKDFFNFLRTTSES PL 
I*PHSKISLTF*ERLQNHH* 
17641 - AGCTACAGCCTGCTCATAGGCCTCCTGGGCAGTGGCATAAGCGGCATATGATGGTAAAGA - 17700

-SYSLLIGLLGSGISGI*W*R 
-ATACS*ASWAVA*AAYDGKE 
LQPAHRP PGQWHKRHMMVKN 
17701 - ACTAAATTCTGAAGCAATAGCCTGAAGAGTAGCACGGTTATCGAGCATTTCCTCGCACAA - 17760 

- T K F * SNSLKSSTVIEHFLAQ 
-LNSEAIA*RVARLSSISSHN 

* I L K Q * PEE*HGYRAFPRTT 

17761 - CCTATTAATGTCTACAGCACCCTGCATGGATAGCAAAACAGACAAAAGAGAAACCATCTT - 17820

- P INVYSTLHG* QNRQKRNHL 
-LLMSTAPCMDSKTDKRETIF 

Y * CLQHPAWIAKQTKEKPSS 
17821 - CTCGAAAGCTTCAGTTGTGTCTTTTGCAAGAAGAATATCATTGTGGAGTTGTACACATTG - 17880
-LESFSCVFCKKN I IVELYTL 
-SKASVVSFARRISLWSCTHC 
RKLQLCLLQEEYHCGVVH IV 
17881 - TGCCCACAATTTAGAAGATGACTCTACTCTAAGTTGTTGAAGAACCGAGAGCAGTACCAC - 17940
-CPQFRR* LYSKLLKNREQYH 
-AHNLEDDSTLSC*RTESSTT 
PTI*KMTLL*VVEEPRAVPQ 
17941 - AGATGTGCACTTTACGTCAGACATTTTAGACTGTACAGTAGCAACCTTGATACATGGTTT - 18000
-RCALYVRHFRLYSSNLDTWF 
-DVHFTSDILDCTVATLIHGL 
MCTLRQTF*TVQ*QP*YMVY 

18001 - ACCTCCAATACCCAACAACTTAATGTTAAGCTTGAAAGCATCAATACTAGTCTTAGGAGG - 18060

-TSNT QQLNVKLESINTTLRR 

- P P I PNNLMLSLKASILLLGG 

LQYPTT*C*A*KHQYYS*EA 
18061 - CAAAAGCCCCTGGGAGTTCATATACCTAAATTCTTGTGTAGAGACCAAGTAGTCATAAAC - 18120 
-QKPLGVHI PKFLCRDQVV IN 
-KSPWEFIYLNSCVETK*S*T 
KAPGSSYT*ILV*RPSSHKH 
18121 - ACCAAGAGTAAGCCTGAAGTAACGGTTGAGTAAACAGAAAAGGCCAAAGTAGCAGCAGCA - 18180 
-TKSKPEVTVE* TEKAKVAAA 
-PRVSLK*RLSKQKRPK*QQQ 
QE*A*SNG*VNRKGQSSSSN 
18181 - ACAATAGCCTAAGAAACAATAAACAAGCATGATACACTGTAAGGTGTTGCCAGTAATAAA - 18240

- T I A * ETINKHDTL*GVASNK 
-Q*PKKQ*TSMIHCKVLPVIN 

NSLRNNKQA* YTVRCCQ* * I 
18241 - TAACAATGGGTAATACTCAACACACACAAACACTATAGCTCTAGCTAAAAACATGATAGT - 18300

- * Q W V ILNTHKHYSSS* KH DS 

- N N G * YSTHTNTIALAKNMIV 

TMGNTQHTQTL*L*LKT**S 
18301 - CGTAACGACACCAGAATAGTTAGAGGTTACAGAAATAACTAAGGCCCACATGGAAATAGC - 18360 
-RNDTRIVRGYRNN*GPHGNS 

- V T T P E * LEVTEITKAHMEIA 

* R H Q N S * R L Q K * LRPTWK*L 

18361 - TTGATCTAAAGCATTACCATAGTAGACTTTGTAAACAAGTGTAATGACATTCATCAGTGT - 18420

- L I * S ITIVDFVNKCNDIHQC 
-*SKALP**TL*TSVMTFISV 

DLKHYHSRLCKQV**HSSVS 
18421 - CCAAACACGTCTAGCAGCATCATCATAAACAGTGCGAGCTGTCATGAGAATAAGCAAAAC - 18480
-PNTSSSI I INSASCHENKQN 
-QTRLAASS *TVRAVMRISKT 
KHV* QHHHKQCELS* E * A K L 
18481 - TAAAGCTGAAGCATACATAACACAATCCTTAAGCCTATAACCAGACAAGCTAGTGTCAGC - 18540
-*S*SIHNTILKPITRQASVS 
-KAEAYITQSLSL* PDKLVSA 
KLKHT*HNP*AYNQTS*CQP 

18541 - CAATTCAAGCCATGTCATGATACGCATCACCCAGCTAGCAGGCATGTAGACCATATTAAA - 18600
-QFKPCHDTHHPASRHVDHIK 
-NSSHVMIRITQLAGM* TILK 
IQAMS*YASPS*QACRPY*S 

18601 - GTAAGCAACTGTTGCAAGAGAAGGTAACAGAAACAAGCACAAGAATGCGTGCTTATGCTT - 18660 
-VSNCCKRR * QKQAQECVLML 

- *ATVAREGNRNKHKNACLCL 

KQLLQEKVTETSTRMRAYA* 
18661 - AACAAGCAGCATAGCACATGCAGCAATTGCCATAATACCAAGAGTAAATGGCAAGAAAGC - 18720 
-NKQHST CSNCHNTKSKWQES 
-TSSIAHAAIAIIPRVNGKKA 
QAA*HMQQLP*YQE*MARKH 
18721 - ATTCTCGTAAACAAAGAAAAACAGTGACCACTGTGTACTTTGAACAAGAATCAATAGTGA - 18780
-ILVNKEKQ* PLCTLNKNQ* * 
-FS^TKKNS DHCVL*TRINSD 
SRKQRKTVTTVYFEQESIVM 
18781 - TGTCAAGAAAGTTAAAAGCATCCAATGATGAGTGCCCTTAACAATTTTCTTGAACTTACC - 18840
-CQES * KHPMMSALNNFLELT 
-VKKVKSIQ**VPLTIFLNLP 
SRKLKASNDECP*QFS*TYL 
18841 - TTGGAAGGTAACACCAGAGCATTGTCTAACAACATCAAATGGTGTAAACTCATCTTCTAA - 18900
-LEGNTRALSNN X KWCKLI F * 
-WKVTPEHCLTTSNGVNSSSK 
GR*HQSIV*QHQMV*THLLK 
18901 - AATAGTGCTACCAAGGATAGTACGACCATTCATACCATTCTGCAGCAGCTCTTTCAAAGC - 18960 
-NSATKDSTTIHTILQQLFQS 

- IVLPRIVRPFIPFCSSSFKA 

*CYQG*YDHSYHSAAALSKQ 
18961 - AGCACACATATCTAAGACGGCAATTCCTGTTTGAGCAGAAAGAGGTCCCAATATGTCAAC - 19020

- S T H I * DGNSCLSRKRSQYVN 
-AHI SKTAI PV*AERGPNMST 

HTYLRRQFLFEQKEVP ICQH 
19021 - ATGATCTTGTGTCAAAGGTTCATAGTTGTACTTCATTGCCACAAGGTTAAAGTCATTCAA - 19080
-MILCQRFIVVLHCHKVKVIQ 
* S C V K G S * LYFIATRLKSFK 
DLVSKVHSCTSLPQG*SHSK 
19081 - AGTAGTGGTGAATCTATTAAGAAACCACCTATCACCATTGATAACAGCAGCATACAGCCA - 19140
-SSGESIKK PPITIDNSSIQP 
-VVVNLLRNHLSPLITAAYSH 
*W*IY*ETTYHH**QQHTAM 
19141 - TGCCAAAACATTTAATGTTATGGTTGTGTCTGTACCTGCAGCCTGTGCAGTTTGTCTGTC - 19200 
-CQNI*CYGCVCTCSLCSLSV 
-AKTFNVMVVSVPAACAVCLS 
PKHLMLWLCLYLQPVQFVCQ 
19201 - AACAAATGGACCATAGAATTTACCTTCTAAGTCAGTACCAGCGTGTACTCCTGTTGGAAG - 19260 
-NKWTIEFTF*VSTSVYSCWK 
-TNGP*NLPSKSVPACTPVGS 
QMDHRIYLLSQYQRVLLLEA 
19261 - CTCCATATGATGCATATAGCAGAAAGACACGCAATCATAATCAATGTTAAAACCAACACT - 19320 
-LHMMH IAERHAI I INVKTNT 
-SI*CI*QKDTQS*SMLKPTL 
PYDAYSRKTRNHNQC*NQHY 
19321 - ACCACATGATCCATTAAGGAAAGAACCTTTAATGGTATGATTAGGTCTCATGGCACACTG - 19380

- T T * S IKERTFNGMIRS HGTL 

PHDPLRKEPLMV*LGLMAH* 
HMIH*GKNL*WYD*VSWHTD 
19381 - ATAAACACCAGATGGTGAACCATTGTAGCATGGTAGAACTGAAAATGTTTGACCAGGTTG - 19440 
-INTRW* TIVAC*N* KCLTRL 

- *TPDGEPL*HARTENV* PGW 

KHQMVNHCSMLELKMFDQVG 
19441 - GATACGGACAAATTTATACTTGGGTGTCTTAGGGTTAGAAGTATCAACTTTAAGCCTAAG - 19500 

- D T D K F I LGCLRVRS IN F K P K 
-IRTNLYLGVLGLEVSTLSLS 

YGQIYTWVS*G*KYQL*A*A 
19501 - CAGACAATTTTGCATAGAATGGCCAATAACACGAAGTTGAACATTGCCAGCCTGAACAAG - 19560 
-QTILHRMANNTKLNIASLNK 
-RQFCIEWPITRS*TLPA*TR
DNFA*NGQ*HEVEHCQPEQE 
19561 - AAAGCTATGGTTGGATTTGCGAATGAGCAGATCTTCATAGTTAGGATTAAGCATGTCTTC - 19620 
-KAMVGFANEQ I FI VRI KHVF 
-KLWLDLRMSRSS *LGLSMSS 
SYGWICE*ADLHS* D * A C L L 
19621 - TGCTGTGCAAATGACATGTCTTGGACAGTATACTGTGTCATCCAACCACAATCCATTAAG - 19680 
-CCANDMSWTVYCVIQPQSIK 
-AVQMTCLGQYTVSSNHNPLR 
LCK*HVLDSILCHPTTIH*"E 
19681 - AGTTGTAGTTCCACAGGTTACTTGTACCATGCACCCTTCAACTTTGCCTGACGGGAATGC - 19740 
-SCS STGYLYHAPFNFA*REC 
-VVVPQVTCTMHPSTLPDGNA 
L * FHRLLV PCTLQLCLTGMP 
19741 - CATTTTCCTAAAACCACTCTGCAGAACAGCAGAAGTGATTGATGTCTGTGGTGGTTGGTA - 19800 
-HFPKTTLQNSRSD*CLWWLV

- IFLKPLCRTAEVIDVCGGW* 

FS *NHSAEQQK* LMSVVVGR 
19801 - GAGAACATCAGCACCTGAGTTGCTAAAGTCATTTAGAGCCTTTGCTAAGTGGCAGCAAGC - 19860 
-ENIST*VAKVI*SLC*VAAS
RTSAPELLKS FRAFAKWQQA 
EHQHLSC* SHLEPLLSGSKL 
19861 - TGCTTCACGATAGCTGGTAGTATCTAAGGCTCCACTGAAATACTTGTACTTGTTATATAG - 19920
-CFTIAGSI*GSTEILVLVI* 
-ASR*LVVSKAPLKYLYLLYR 
LHDSW*YLRLH*NTCTCYIE 
19921 - AGCAAGATACCTGTTATACTGTGTAAGTGGCAACAGTGTCTCGCTACGCAATTTTAGGTA - 19980 
-SKI PVILCKWQQCLATQF*V 
-ARYLLYCVSGNSVSLRNFRY 
QDTCYTV*VATVSRYAILGT 
19981 - CATTTCCTTGTTGAGCAAAAAGGTACACAAAGCAGCCTCCTCGAAGGTACTAAATGTAAC - 20040
-HFLVEQKGTQSSLLEGTKCN 

- ISLLSKKVHKAASSKVLNVT 

FPC*AKRYTKQPPRRY*M*L 
20041 - TCCATTAAACATGACTCTTTTCCTAAGATAGTTGTTAAAGAACCAATGGCAGTGCTTCAG - 20100 

- S IKHDSFPKIVVKEPMAVLQ 

- PLNMTLFLR* LLKNQWQCFR 

H*T*LFS*DSC*RTNGSASE 
20101 - AGAAATACAGAATACATAGATTGCTGTTATCCAAAAAGGCACAATAGGAGAAAACATGGC - 20160 
-RNTEYI DCCYPKRHNRRKHG 
-EIQNT* IAVIQKGTIGENMA 

KYRIHRLLLSKKAQ*EKTWQ 
20161 - AAACCATTGAAGGTGAGCCAAGAATGAAACATCATTGGTGAAATAGAATGTCAAGTACAA - 20220
-KPLKVSQE*NI IGEIECQVQ 
-NH*R*AKNETSLVK*NVKYK 
TIEGEPRMKHHW*NRMSSTS 

20221 - GTAAAAGACTGAGTAGACTCCCGGCAGAAAGCTGTAAGCTGGTACCAGACAGAGTATAGT - 20280 

- V K D * VDSRQKAVSWYQTEYS 
-*KTE*TPGRKIi*AGTRQSIV 

KRLSRLPAESCKLVPDRV** 
20281 - GAAAGACATCAAAAACAAAAGTGCATTAGCAGCAACAACATGGTTGTACTCACCAAAAAC - 20340 
-ERHQKQKCI SSNNMVVLTKN 
-KDIKNKSALAATTWLYSPKT 
KTSKTKVH*QQQHGCTHQKH 
20341 - ACGTCTGAATTTCATAAAGTAGTAGGCAGCACAAGTCACCAATATGGCAATAATACCACC - 20400 
-TSEFHKVVGSTSHQYGNNTT 

- R L N F I K * *AAQVTNMAI IPP 

V*IS*SSRQHKSPIWQ*YHQ 
20401 - AGCCACTACTGAAGCAGACACATCTAAAGCACCCACAGGTTGCACAAGAGGAGTAAAGAT - 20460 
-SHY* SRH I * STHRLHKRSKD 

- ATTEADTSKAPTGCTRGVKM 

PLLKQTHLKHPQVAQEE*RC 
20461 - GTTAGCTATGAGATTCATCGCATCAACACCACAGAAAACTCCTGATAGAGCTCTGTAATG - 20520 
-VSYEIHRINTTENS**SSVM 

-LAMRFIASTPQKTPDRAL*C 

* L * DSSHQHHRKLLIELCNA 
20521 - CTCATTATTAAGAACCCATCTACCACTGGTAGATAGGCAAATACCTACTTCTGACCTTTC - 20580 
-LIIKNPSTTGR*ANTYF*PF 

- SLLRTHLPLVDRQI PTSDLS 

HY*EPIYHW*IGKYLLLTFR 
20581 - GCATGTACCATGTCTACAGTACTCAGCATCAAAAGTTGTTACTACTCTAACAGAACCCTC - 20640 
-ACTMSTVLS IKSCYYSNRTL 
-HVPCLQYSASKVVTTLTEPS 
MYHVYSTQHQKLLLL* QNPP 
20641 - CAGGTAAGTGTTAGGAAACTGTATGATGGAACCATCCATAAGCACATAACGAGTGTCTGG - 20700 
-QVSVRKLYDGTIHKH ITSVW 
-R*VLGNCMMEPSIST*RVSG 
GKC*ETV*WNHP*AHNECLD 
20701 - ACGAAGCTCACTATAAGAAATAGAACCCTCTAGCAAATTAGTGTCATAACAATATGGCAC - 20760 
-TKLT I R N R T L * QI SVITIWH 
-RSSL*EIEPSSKLVS*QYGT 
EAHYKK*NPLAN*CHNNMAQ 
20761 - AGGTTTGCCCATAGCATCCTTAAAAATTGTACACTCAGCAGCAAGAACGGAAGCAGAGGT - 20820 
-RFAHSILKNCTLSSKNASRG 
-GLPIASLKIVHSAARTQAEV 
VCP*HP*KLYTQQQERKQR* 
20821 - AGCAAAATCACTATACTCAATGAGTTTGGAAGGTGTGTAGCAAATGTTGCCAACAGCACT - 20880 
-SKI TILNEFGRCVANVANST 

- AKSLYSMSLEGV*QMLPTAL 

QNHYTQ*VWKVCSKCCQQH* 
20881 - AAAAACACGAGGTAGAAAATGCAAGAAGTCACCATTGATTGCTCTCAGCACAGTACCCGG - 20940 
-KNTR*KMQEVTI DCSQHSTR 
-KTRGRKCKKSPLIALSTVPG 
KHEVENARSHH* LLSAQYPV 
20941 - TAAGCCAGGCACTATGAAACCAATCTCTCTTGTAATGATAGCAGCTACTACAGGGCAGCT - 21000 
-*ARHYETNLSCNDSSYYRAA 
-KPGTMKP ISLVMIAATTGQL 
SQAL*NQSLL* * *QLLQGSF 
21001 - TTTGTCATTTTTGTATGAACCACCACGCTGGCTAAACCATGCGTCAAAACCAGCATGTTT - 21060 

- F V I F V * TT TLAKPCVKTSMF 
-LSFLYEPPRWLNHASKPACL 

CHFCMNHHAG* TMRQNQHVY 
21061 - ATTTGCAAAACAATCATCAGTAGAAATGATGTCACGAGTGACACCATCCTGAATGGCTTT - 21120 
-ICKTIISRNDVTSDTILNGF 
-FAKQSSVEMMSRVTPS*MAL 
L Q N N H Q * K*CHE*HHPEWLC 
21121 - GTAACCAATGATTTCATTTGTGTAACCATCATGGATTGACAATGTATGTACTGGCATAAC - 21180 
-VTN DFI C V T I M D * QCMYWHN 
-*PMISFV*PSWIDNVCTGIT 
NQ*FHLCNHHGLTMYVLA*R 
21181 - GATATAACAAACCAATGCAGCAAGAACGCACAATAATGTGGCCTTAAGCATAAGTTTAAA - 21240 

- D I TNQCSKNAQ*CGLKHKFK 
-I*QTNAARTHNNVALSISLK 

YNKPMQQERTIMWP*A*V*N 
21241 - ACAAGTACTAACAATCTTACCACCCTTGAGTGAGATTTTAGTAGTTATGACATTGACAAC - 21300 

- T S TNNLTTLE* DFSSYDI DN 
-QVLTILPPLSEILVVMTLTT 

KY*QSYHP*VRF**L*H*QP 
21301 - CTGTCTAGTTGTAGCACAAGTTAGTGTAAAAGGTATGTTGTTCTTCTTGGCAGCAGTACG - 21360 
~LSSCSTS*CKRYVVLLGSST 
-CLVVAQVSVKGMLFFLAAVR 
V*L*HKLV*KVCCSSWQQYE 
21361 - AATTTGTTTACGCAGCTGTTCAGATAAAGACATGTAGTCTTTTACATTCCAGATGAGTGA - 21420 
-NLFTQLFR*RHVVFYIPDE* 
~ICLRSCSDKDM*SFTFQMSE 
FVYAAVQIKTCSLLHSR*VK 
21421 - AACATTGTGACTTTTTGCTACTTGGGCATTGATATGCCTTGCATTACAGTCAATACATGC - 21480 
-NIVTFCYLGIDMPCITVNTC 
-TL*LFATWALICLALQSIHA 
HCDFLLLGH*YALHYSQYMR 
21481 - GCCAAGATCTCTGGGCGTCATGTTTTCAACCTTATTATAGGTGAGCATGAAATTGTTACA - 21540 

- A K I S GRHVFNLI I GEHE IVT 
-PRSLGVMFSTLL*VSMKLLQ 

QDLWASCFQPYYR*A*NCYN 
21541 - ACTGTCACCTGTCACTTCTAAGTCAGAGTGATGTGAAAGTTTGAGACATTCAATAACATC - 21600 
-TVTCHF*VRVM* KFETFNNI 
-LSPVTSKSE*CESLRHSITS 
CHLSLLSQSDVKV* D I Q * H P 
21601 - CTTTGTGTCAACATCGGTATCAACAACACCTTGTCGGGCAGCTGACACGAATGTAGAAAG - 21660 
-LCVNIGINNTLSGS* HECRK 

- FVST SVSTTPCRAADTNVER 

LCQHRYQQHLVGQLTRM*KG 
21661 - GACACCATCTAAAGCTACACCCTTTGCTAACTCGCTGTGAGGTGTAGCAACAAGTGCCTT - 21720 
-DTI * SYTLC*LAVSCSNKCL 

- T P S KATP FAN S L * A V A T SAL 

HHLKLHPLLTRCEL*QQVP* 
21721 - AAGTTTTTCCATAGGAACACTAAAAGTTGCTGAAAAGGTGTCGACATAAGCATCAAACAT - 21780 
-KFFHRNTKSC* KGVDISIKH 
-SFSIGTLKVAEKVST*ASNI 
VFP*EH*KLLKRCRHKHQTS 
21781 - CTTAACGGAAACTTCAGTACTATCTCCAACGTTTGATACAAGAGCTTGGTCAAGCAACAG - 21840 
-LNGNFSTI SNV*YKSLVKQQ 
-LTETSVLSPTFDTRAWSSNR 
*RKLQYYLQRLIQELGQATE 
21841 - AATAGGTTGGCACATCAGCTGACTGTAGTACACAGAAGCAGACTTAGAAGCAGACTCGTC - 21900 
-NRLAHQLTVVHRSRLRSRLV 
-IGWHIS*L*YTEADLEADSS 
*VGTSADCSTQKQT*KQTRR 
21901 - GCATTTGGACTTGCCATCAAAAACTATGACATTAATAGGCAGTGAACCTTTAGTGTTGTT - 21960 
-AFGLAIKNYDINRQ* T FSVV 
-HLDLPSKTMTLIGSEPLVLL 
IWTCHQKL*H**AVNL*CC* 
21961 - AGCTCTCAAATTGTCTAAATTGACAAAATGGGAGAGCGGATGTCTCTCATAGGTCTTTTG - 22020 
-SSQIV* IDKMGERMSL IGLL 
-ALKLSKLTKWESGCLS + VF* 
LSNCLN*QNGRADVSHRSFD 
22021 - ACCAGCCTTGTCAAAGTAGAGGTGAAGCGCGCCATTTTTCACAGCAACACTATCAACAAT - 22080 
-TSLVKVEVKRAI FHSNTINN 
-PALSK*R*SAPFFTATLSTI 
QPCQSRGEARHFSQQHYQQY 
22081 - ATACGATGACTGGTCAGTAGGGTTGATTGGTCTTTTAAACTGGAGTGACAAATCACGAGC - 22140 
-IR*LVSRVDWSFKLE*QITS 
-YDDWSVGLIGLLNWSDKSRA 
TMTGQ*G*LVF*TGVTNHEQ 
22141 - AACTTCATCACTAATGAATGTACTACCAGTGCAAAATGTGTCACAATTGAGACAATTCCA - 22200 
-NFITNECTTSAKCVTIETIP 
-TSSLMNVLPVQNVSQLRQFQ 
L H H * *MYYQCKMCHN* DNSN 
22201 - ATTGTGAGTCTTGCAGAAGCCACGGCCTCCATTTGCATAGACATAGAAAGATCTCTTCAT - 22260 
- I V S LAEATAS I CI DI ERSLH 
-L*VLQKPRPPFA*T*KDLFM 
CESCRSHGLHLHRHRKISSC 
22261 - GCCATTAACAATAGTTGTACACTCAACGCGTGTGGCACGATTGCGCTTATAGCACATCAT - 22320 
-AI N N S CTLNACGT IALIAHH 
-PLTIVVHSTRVARLRL^HIM 
H *Q*LYTQRVWHDCAYSTSC 
22321 - GCAAGTCGAAGAGGTGCAACCATCCATGATATGAACATAGCTCTTCCATATGTAGTAGAA - 22380 
-ASRRGATI HDMNIALPYVVE 
-QVEEVQPSMI*T*LFHM**K 
KSKRCNHP*YEHSSSICSRK 
22381 - AGAAGCAAAGAAGATGTACATCCTAACCATTGCAGAAACGGGTGCCATTTGTACAATACT - 22440 
-RSKEDVHPNHCRNGCHLYNT 
-EAKKMYILTIAETGAICTIL 
KQRRCTS* PLQKRVPFVQY* 
22441 - AATGATAAACCACATGAGCCAAGAATTGCTGATGAAATGACTAGCAAAATAGCCAAAGAA - 22500 
-NDKPHEPRIADEMTSKIAKE 
-MINHMSQELLMK*LAK*PKN 
* * TT*AKNC* *ND* QNSQRT 
22501 - CACCTGCATTATAGCTGAAAGAGCTAATAAATAAAAGAATTTTGTGAACAACATATATGC - 22560 
-HLHYS*KT**IKEFCEQHIC 
-TCIIAERPNK*KNFVNNIYA 
PAL*LKDLINKRIL*TTYMP 
22561 - CAAAACCCACTCAGCGGCCAGACCTAAAATTGTCAAGTCTAGCTTGTACGATGAAATCGT - 22620 
-QNPLSGQT*NCQV*LVR*NR 
-KTHSAARPKIVKSSLYDEIV 
KPTQRPDLKLSSLACTMKSS 
22621 - CACCTGAATGGTTTCAAGAGCTGGATAAGAATCAAGGGAGTCTAATCCACTTAAACAAAT - 22680 
-HLNGFKSWIRIKGV* S T * T N 
-T*MVSRAG*ESRESNPLKQM 
PEWFQELDKNQGSLIHLNKC 
22681 - GCTGCAAGGAAAAGAACCTTCACAGAAATCCATAGTAGTAACGTTAGACGAATTAAGATA - 22740 
-AARKRTFTE IH S SNVRRI KI 
-LQGKEPSQKS IVVTLDELRY 
CKEKNLHRNP* * *R*TN*DT 
22741 - CAATTCTCTAACGCCATTACAATAAGAAGGAGCACCAAAATTAGATAAGAGTACACCAAA - 22800 
-QFSNAITIRRSTKIR* EYTK 
-NSLTPLQ* EGAPKLDKSTPK 
IL*RHYNKKEHQN*IRVHQK 
22801 - AGCAGCAGTTACACAGATTAGAGAACCTAAGCAAATACTTAACAACAATAGCCACATAGC - 22860 
-SSSYTD*RT*ANT*QQ*PHS 
-AAVTQIREPKQILNNNSHIA 
QQLHRLENLSKYLTTIAT*R 
22861 - GATTGTGAACAATTTAGAAAATTTGGGTGACTTCACATAATTAATGCCGGCATCCAAACA - 22920 
-DCEQFRKFG* LH I INAGIQT 

- I VNNLENLGDFT * LMPASKH 

L*TI*KIWVTSHN* CRHPNI 
22921 - TAATTTAGCAACACTCTTAACACTATTTTTAGCAATAGTTGTAGGTAGTGAAGCTCTAAT - 22980 

- * FSNTLNTIFSNSCR* * S S N 

- NLATLLTLFLAIVVGSEALI 

I*QHS*HYF*Q*L*VVKL*F 
22981 - TCTAGAATTGGTACTTTTAGTAAAAGTACACAATTGGAACAATAATGTAAACACATAAGG - 23040 
-SRIGTFSKS TQLEQ*CKHIR 
-LELVLLVKVHNWNNNVNT*G 
*NWYF**KYTIGTIM*THKA 
23041 - CATATAATTGTTAAACACACGTTGTGCTAATCTCTTAGCGCAATTTGATGTTGTAATTGC - 23100 
-HIIVKHTLC*SLSAI*CCNC 

- I * LLNTRCANLLAQFDVVIA 

YNC*THVVLIS*RNLML*LL 
23101 - TGCTTGTCCTAAGAATGGTTTGACATAAGCCAAAATTTTACTCCAAGGAACACTATTAAT - 23160 
-CLS*EWFDI SQNFTPRNTIN 
-ACPKNGLT*AKILLQGTLLI 
LVLRMV*HKPKFYSKEHY*L 
23161 - TGCAGCAATACCATGAGTGGCAATTGTTTTTAAACCTAAGGCTAGTGAAAGCTCATTAGG - 23220 
-CSNTMSGNCF*T*G**KLIR 
-AAIP*VAIVFKPKASESSLG 
QQYHEWQLFLNLRLVKAH*V 
23221 - TTTCTTAATGGTAATGCTTGTGTTTTCCACATAAGCAGCCATAAGATCCTCATGACCTAA - 23280 
-FLNGNACVFH I SSHKILMT* 

- FLMVMLVFST*AAIRS S * P N 

S*W*CLCFPHKQP*DPHDLT 
23281 - CTCTTGTGTTACTTTAACACCTTCATCTGATGGTTTAAGTATGACATTGCGTACAACTTC - 23340 
-LLCYFNTFI * WFKYD IAYNF 
SCVTLTPSS DGLSMTLPTTS 
LVLL*HLHLMV*V* HCLQLR 
23341 - GGTAGTTTTCACGTCACACTCTATGACTTCCTTCTGTATGGTAGGATTTTCCACTACTTC - 23400 
-GSFHVTLYDFLLYGRI FHYF 
-VVFTSHSMTSFCMVGFSTTS 
*FSRHTL*LPSVW*DFPLLL 
23401 - TTCAGAGGTGGGTTGTTGACTTTCACAAGCAAGATTGTCCATTCCTTGTGTGTCTTCTAC - 23460 
-FRGGLLTFTSKIVHSLCVFY 
-SEVGC*LSQARLSIPCVSST 
QRWVVDFHKQ DC PFLVCLLL 
23461 - TGCCAGAACTTCAAATGAATTTGAAGTATCTACTGGCTTTGTACTCCAAAGACAACGTAA - 23520 
-CQNFK*I*SIYWLCTPKTT* 
-ARTSNEFEVSTGFVLQRQRK 
PELQMNLKYLLALYSKDNVN 
23521 - ACACCAAGTGTTTGGTTTGAACGTTGTCTTGGTTGTAGCCTGGTTAATGTGCCAAACAAT - 23580 
-TPSVWFERCLGCSLVNVPNN 
-HQVFGLNVVLVVAWLMCQTI 
TKCLV*TLSWL*PG*CAKQL 
23581 - TGGCTTATGCAGTAATTTAGCACCTTTCTTGAAACTCGCTGAATAGTGTCTATAGTCAAT - 23640 
-WLMQ* FSTFLETR*IVSIVN 
-GLCSNLAPFLKLAE*CL*SI 
AYAVI*HLS*NSLNSVYSQ* 
23641 - AGCCACTACATCGCCATTCAAGTCTGGGAAGAATGTGACAGATAGCTCTCGTGAAGCTGG - 23700 
-SHYIAIQVWEECDR*LS*SW 
-ATTSPFKSGKNVTDSSREAG 
PLHRHSSLGRM*QIALVKLA 
23701 - CTTTGTGAAGCCTGTCATTTGATTTAAATCATCAGCAAATTTTGTGTTAGAACATGTGAG - 23760 
-LCEACHLI * IISKFCVRTCE 
- FVKPVI*FKSSANFVLEHVS 
L*SLSFDLNHQQILC*NM*V 
23761 - TTTGAAATTATCAAAACTCGCATTTGGTAATGGTTGAGTTGGTACAAGGTCTATAGGCTG - 23820 
-FEIIKTRIW*WLSWYKVYRL 
"~LKLSKLAFGNG*VGTRSIGC 
*NYQNSHLVMVELVQGL*AA 
23821 - CTCTGTATAGTAAGCATTATCCTTTTTATAATACCCATCCAATTTTGGTTCAATCTCTGT - 23880 
-LCIVSIILFIIPIQFWFNLC 
-SV**ALSFL*YPSNFGSISV 
LYSKHYPFYNTHPILVQSLC 

23881 - GTAAGTAACTCCATCGAGTTTATACGACACAGGCTTGATGGTTGTAGTGTAAGATGTTTC - 23940 
-VSNSIEFIRHRLDGCSVRCF 
-*VTPSSLYDTGLMVVV*DVS 

K*LHRVYTTQA*WL*CKMFP 
23941 - CTTGTAGAAAACATCAGTCACTGGTCCTTTGTACTCTGACATCTTTGTAAGGTGAGCTCC - 24000 
-LVENISHWSFVL*HLCKVSS 
-L*KTSVTGPLYSDIFVR*AP 
CRKHQSLVLCTLTSL* GELR 
24001 - GTCAATACGATAGAGGGTCTCCTTAGCAGTTATATGAGTGTAATGACCACACTGATAGTT - 24060 
-VNTIEGLLS SYMSVMTTLI V 
-SIR*RVSLAVI*V**PH**L 
QYDRGSP*QLYECNDHTDSY 
24061 - AGCAGTGTACTCATTCGCACATAAGAATGTACCTTGCTGTAATTTATACTCAGCAGGTGG - 24120 
-TSVLIRT* E C T L L * FILSRW 
-PVYSFAHKNVPCCNLYSAGG 
QCTHSHIRMYLAVI YTQQVV 
24121 - TGCAGACATCATAACAAAAGAAGACTCTTGTTGTACTAGATATTGTGTAGCATCACGACC - 24180 
-CRHHNKRRLLLY* ILCSITT 

- A D I ITKEDSCCTRYCVASRP 

QTS*QKKTLVVLDIV* HHDH 
24181 - ACACACACATGGAATGGAAACACCTGTCTTAAGATTATCATAAGATAGAGTACCCATATA - 24240 
-THTWNGNTCLKIIIR*STHI 
~HTHGMETPVLRLS*DRVPIY 
THMEWKHLS*DYHKIEYPYT 
24241 - CATCAGAGCTTCTACACCCGTTAAGGTAGTAGTTTTCTGACCACAATGTTTACACACCAC - 24300 
-HHSFYTR* GSSFLTTMFTHH 

- ITASTPVKVVVF*PQCLHTT 

SQLLHPLR* *FSDHNVYTPH 
24301 - ATTAAGAACTCGCTTTGCAGATTCCAAATTAGCATGCTGTAGAAGATGGGTCATAGTTTC - 24360 
-IKNSLCRFQISML*KMGHSF 

- LRTRFADS KLACCRRWVIVS 

*ELALQIPN*HAVEDGS*FL 
24361 - TCTGACATCACCAAGCTCGCCAACAGTTTTATTACTGTAAGCGAGTATGAGTGCACAAAA - 24420 
-SDITKLANSFITVSEYECTK 
-LTSPSSPTVLLL*ASMSAQK 
*HHQARQQFYYCKRV*VHKS 

24421 - GTTAGCAGCATCACCAGCACGGGCTCTATAATAAGCCTCTTGAAGTGCTGGTGCATTGAA - 24480 
-VSSITSTGSIISLLKCWCIE 
-LAASPARAL**AS*SAGALN 
*QHHQHGLYNKPLEVLVH* I 

24481 - TTTGACTTCAAGCTGTTGAAGTGCTAATAAAACACTAGACAAATAACAATTGTTATCAGC - 24540 
-FDFKLLKC* * N T R Q I T I V I S 

- L T S S C * SANKTLDK* QLLSA 

*LQAVEVLIKH*TNNNCYQP 
24541 - CCATTTAATTGAAGTTAAACCACCAACTTGAGGAAATTTCCATTTCTTTGTGTGGTTTAA - 24600 

- P F N * S*TTNLRKFPFLCVV* 
-HLIEVKPPT*GNFHFFVWFK 

I*LKLNHQLEBISI SLCGLK 
24601 - AGCAGACATGTACCTACCAAGAAAACTCTCATCAAGAGTATGGTAGTACTCGAAAGCTTC - 24660 
-SRHV PTKKTLIKSMVVLE S F 

- ADMYLPRKLSSRVW* YSKAS 

QTCTYQENSHQEYGSTRKLH 
24661 - ACTACGTAGTGTGTCATCACTAGGTAGTACAAAGAAAGTCTTACCCTCATGATTTACATG - 24720 

- T T * C V I T R * YKESLTLMI YM 
-LRSVSSLGSTKKVLPS* FT* 

YVVCHH*VVQRKSYPHDLHE 
24721 - AGGTTTAATTTTTGTAACATCAGCACCATCCAAGTATGTTGGACCAAACTGCTGTCCATA - 24780 
-RFNFCNISTIQVCWTKLLSI 
-GLIFVTSAPSKYVGPNCCPY 
V*FL*HQHHPSMLDQTAVHM 
24781 - TGTCATAGACATATCCACAAGCTGTGTGTGGAGATTAGTGTTGTCCACAGTTGTGAACAC - 24840 
-CHRHIHKLCVEISVVHSCEH 
-VIDISTSCVWRLVLSTVVNT 
S*TYPQAVCGD*CCPQL*TL 
24841 - TTTTATAGTCTTAACCTCCCGCAGGGATAAGAGACTCTTTAGTTTGTCAAGTGAAAGAAC - 24900 
-FYSLNLPQG*ETL*FVK*KN 
-FIVLTSRRDKRLFSLSSERT 
L*S*PPAGIRDSLVCQVKEP 
24901 - CTCACCGTCAAGATGAAACTCGACGGGGCTCTCCAGAGTGTGGTACACAATTTTGTCACC - 24960 
-LTVKMKLDGALQSVVHNFVT 
-SPSR*NSTGLSRVWYTILSP 
HRQDETRRGSPECGTQFCHH 
24961 - ACGCTTAAGAAATTCAACACCTAACTCTGTACGCTGTCCTGAATAGGACCAATCTCTGTA - 25020 
-TLKK FNT * LCTLS * IGPISV 
-RLRNSTPNSVRCPE* D Q S L * 
A*EIQHLTLYAVLNRTNLCK 
25021 - AGAGCCAGCCAAAGAAACTGTTTCTACAAAGTGCTCCTCAGATGTCTTTGATGACGAAGT - 25080 
-RASQRNCFYKVLLRCL* * RS 
-EPAKETVSTKCSSDVFDDEV 
SQPKKLFLQSAPQMSLMTK* 
25081 - GAGGTATCCATTATATGTAGTAACAGGATCTGGTGATGATACTGACACTACGGCAGGAGC - 25140 

- E V S I ICSNSIW**Y*HYGRS 
-RYPLYVVTASGDDTDTTAGA 

GIHYM* *QHLVMILTLRQEL 
25141 - TTTAAGAGAACGCATACAGCGCGCAGCCTCTTCAAGATTAAAACCATGTGTCACATAACC - 25200 
-FKRT HTARSLFKIKTMCHIT 
-LRERIQRAASSRLKPCVT* P 

*ENAYSAQPLQD*NHVSHNQ 
25201 - AATTGGCATTGTGACAAGCGGCTCATTTAGAGAGTTCAGCTTCGTAATAATAGAAGCTAC - 25260 
-NWHCDKRLI*RVQLRNNRSY 
-IGIVTSGSFREFSFVIIEAT 
LAL*QAAHLESSAS***KLQ 

25261 - AGGCTCTTTACTAGTATAAAAGAAGAATCGGACACCATAGTCAACGATGCGCTCTTGAAT - 25320 

- R L F T S I KEESDTIVNDALLN 
-GSLLV*KKNRTP*STMPS*I 

ALY*YKRRIGHHSQRCPLEF 
25321 - TTTAATTCCTTTATACTTACGTTGGATGGTTGCCATTATGGCTCTAACATCCATGCATAT - 25380 
-FN SFILTLDGCHYGSN I H AY 
-LIPLYLRWMVAIMALTSMHI 
*FLYTYVGWLPLWL*HPCI* 
25381 - AGGCATTAATTTTCTTGTCTCTTCAGCATGAGCAAGCATTTCTCTCAAATTCCAGGATAC - 25440 
-RH*FSCLFSMSKHFSQIPGY 
-GINFLVSSA*ASISLKFQDT 
ALIFLSLQHEQAFLSN SRIQ 
25441 - AGTTCGTAGAATCTCTTCCTTAGGATTAGGTGCTTCTGAAGGTAGTACATAAAATGCAGA - 25500 
-SS*NLFLSIRCF*R*YIKCR 
-VPRISSLAL* GASEGST*NAD 
FLESLP*H*VLLKVVHKMQI 
25501 - TTTGCATTTCTTAAGAGCAGTCTTAGCTTCCTCAAGTGTATAACCAGCACATCCTTGTCG - 25560 
-FAFLKS SLS FLKC ITS TSLS 

- LHFLRAVLASSSV*PAHPCP 

CIS*EQS*LPQVYNQHILVQ 
25561 - AGGGTACGTGGTTATATACTCATCAACTGGCACTTTCTTCAAAGCTCTTGAGAGCATCTC - 25620 

- R V R G Y I LINWHFLQSS * EHL 
-GYVVIYSSTGTFFKALESIS 

GTWLYTHQLALSSKLLRASQ 
25621 - AGTAGTGCCACCAGCCTTTTTGGAGGGTATTACAACACAAGTGATATCACCACTAGTGAT - 25680 

- S SATSLFGGYYNTSDI TTSD 
-VVPPAFLEGITTQVISPLVI 

*CHQPFWRVLQHK*YHH*** 
25681 - AACATCACCTACCATGTAAGGTGCATCCTTCTCAAGGAAAGACATATCTTCACCTCTAAG - 25740 
-NITYHVRCILLKERHIFTSK 

- TSPTM*GASFSRKDISSPLS 

HHLPCKVHPSQGKTYLHL*A 
25741 - CATGTTCTGAGAATCATGGTAAAGCTTACCATTGATATCAGCAAACAAGAGTAACTTATT - 25800 
-HVLRIMVKLTIDISKQE* LI 
-MF*ESW*SLPLI SANKSNLL 
CSENHGKAYH*YQQTRVTYW 
25801 - GGTAAGAAACTTAGTTTCTTCCAGTGTTGTGGTAACCTCATCAATGCAGGCCTTAATTTT - 25860 
-GKKLSFFQCCGNLINAGLNF 
-VRNEVSSSVVVTSSMQALIF 
*ET*FLPVLW*PHQCRP*FL 
25861 - TGGCTTCACATCGACAGGCTTCTGTACGACAGATTTCTCCTCAGTTTTGGAATCTTCTGT - 25920 
-WLHIDRLLYDRFLLSFGI FC 

- GFTSTGFC. TTDFSSVLESSV 

ASHRQASVRQISPQFWNLLC 
25921 - GTTTGGTGGCTCCTCTTGTTTAGGTGCTTCCACTCTAGGCTTCAGGTTATCAAGATAATC - 25980 
-VWWLLLFRCFHSRLQVIKII 
-FGGSSCLGASTLGFRLSR* S 
LVAPLV*VLPL*ASGYQDNP 
25981 - CATGACAACCTGCTCATAAAGAGCTTTGTCATTGACTGCAATATAAACCTGTGTACGAAC - 26040 
-HDNLLIKSFVI DCNINLCTN 
-MTTCS *RALSLTAI * T C V R T 
*QPAHKELCH*LQYKPVYEP 
26041 - CGTCTGCACGCACACTTGTAAAGACTGAAGTGGTTTAGCACCAAATATGCCTGCTGACAA - 26100 
-RLHAHL* RLKWFSTKYAC*Q 
-VCTHTCKD*SGLAPNMPADN 
SARTLVKTEVV*HQICLLTT 
26101 - CAATGGTGCAAGTAAGATGTCCTGTGAATTGAAATTTTCATATGCTGCCTTAAGAAGCTG - 26160 
-QWCK*DVL*IEIFICCLKKL 
-NGASKMSCELKFSYAALRSW 
MVQVRCPVN*NFHMLP*EAG 
26161 - GATGTCCTCACCTGCATTTAGGTTAGGTCCAACAACATGCAGACACTTCTTAGCAAGATT - 26220 
-DVLTCI * VRSNNMQTLLSKI 
-MSSPAFRLGPTTCRHFLARL 
CPHLHLG*VQQHADTS * QDY 
26221 - ATGTCCAGAAAGCAAACAAGACCCTCCTACTGTAAGAGGGCCATTTAGCTTAATGTAATC - 26280 
-MSRKQTRPSYCKRAI * LNV I 
-CPESKQDPPTVRGPFSLM*S 
VQKANKTLLL*EGHLA* CNH 
26281 - ATCACTCTCCTTTTGCATGGCACCATTGGTTGCCTTGTTGAGTGCACCTGCTACACCACC - 26340 
-ITLLLHGTIGCLVECTCYTT 
-SLSFCMAPLVALLSAPATPP 
HSPFAWHHWLPC*VHLLHHH 
26341 - ACCATGTTTCAGGTGTATGTTAGCAGCATTTACAATCACCATAGGATTAGCACTTTGTGC - 26400 
-TMFQVYVSS IYNHHRI STLC 
PCFRCMLAAFTI TIGLALCA 
HVSGVC*QHLQSP*D*HFVP 
26401 - CTCCTTAACGATGTCAACACATTTAATGGCAACATTGTCAGTAAGTTTTAAATAACCAGT - 26460 
-LLNDVNT FNGN I VSKF * ITS 
-SLTMSTHLMATLSVSFK^PV 
P * R C Q H I *WQHCQ*VLNNQ* 
26461 - AAACTGATTAACTGGTTCTTCAGGTGTAGGTTCTGGTTCTGGCTCAATCTCTGATTGCTC - 26520 
-KLINWFFRCRFW FWLNL* LL 
-N*LTGSSGVGSGSGSISDCS 
TD*LVLQV*VLVLAQS LIAQ 
26521 - AGTAGTATCATCCAGCCAGTCTTCCTCTTCTTCTTCCTCAACTCGAACTGTTTCAGCTGA - 26580 
-S SI I QPVFLFFFLNSNCFS* 
-VVSSSQSSSSSSSTRTVSAE 
*YHPASLPLLLPQLEL FQLR 
26581 - GGCACCAAATTCCAGAGGGAGACCTTGATAATCATCCTCTGTACCGTACTCATGTTCACA - 26640 
-GTKFQRETLI I ILCTVLMFT 
-APNSRGRP**SSSVPYSCSQ 
HQIPEGDLDKHPLYRTHVHR 
26641 - GGTTTCATCAATTTCTTCTTCCTCACACTCTGCATCGTCCTCTTCTTCCTCATCTGGAGG - 26700 

- G F I N FFFLTLCIVLFFLI WR 
-VSSISSSSHSASSSSSSSGG 

FHQFLLPHTLHRPLLP HLEG 
26701 - GTAAAAGGAACAATACATACGTGATGAAAAGTTTTCTTCACCAGCATCATCAAATAAGTA - 26760 
-VKGTIHT**KVFFTSIIK*V 

- *KEQYIRDEKFSSPASSNK* 

KRNNTYVMKSFLHQHHQISR 
26761 - GAATGTAGCTACACTCCACTCATCAAGATCAATACCCATGTTGGTAAGGAGATCAGAAAC - 26820 
-ECSYTPLIKINTHVGKEIRN 
-NVATLHSSRSIPMLVRRSET 
M*LHSTHQDQYPCW*GDQKL 
26821 - TGGTTGTAAAGTCTTCACAACAGCCTCTGCTACAACACATGCAAACTCAGTAACTTCGGT - 26880 

- W L * S LHNSLCYNTCKLSNFG 
-GCKV FTTASATTHANSVTSV 

VVKS SQQPLLQHMQTQ* LRY 
26881 - ACCGGATTCAACAGTGTAGACAGAGCACTTTTCATTAAGCACTTTGTCAACACGTTCATC - 26940 
-TGFNSVDRALFIKHFVNTFI 
-PDSTV* TBHFSLSTLSTRSS 
RIQQCRQSTFH*ALCQHVHQ 
26941 - AAGCTCAAATGTGATTCTCACATTCTTGTAACCTTGAACTTCCCAAACAGTATCTTCTCC - 27000 
-KLKCDSHILVTLNFPNSIFS 
-SSNVILTFL*P*TSQTVSSP 
A Q M * FSHSCNLELPKQYLLQ 
27001 - AAAGGTTACACCTTTAATTGGTGCACCCCCTTTTAAGCGAAAGACATTGTTTGTAGCCAG - 27060 
-KGYTFNWCTPF*AKDIVCSQ 
-KVTPLIGAPPFKRKTLFVAS 
RLHL*LVHPLLSERHCL*PV 
27061 - TAAACCAGGAGACAATGCGCAGTATTGTTCTTTGTCCTTAATCTCTAAGAGCATGAGGCC - 27120 
-*TRRQCAVLFFVLNL* EHEA 
-KPGDNAQYCSLSLISKSMRP 
NQETMRSIVLCP *SLRA*GH 
27121 - ATTTACACAGACTGGTGTGCCGACGATAGCTCCATTTGTGAAGCTATCAACGGGCGTCTC - 27180 
-IYTDWCADDSS ICEAI NGRL 
-FTQTGVPTIAPFVKLSTGVS 
LHRLVCRR*LHL*SYQRASR 
27181 - GAGTGCTTCGAGTTCACCGTTCTTGAGAACAACCTCCTCAGAGGTAAGTACTGTGTCATG - 27240 
-ECFE FTVLENNLLRGKYCVM 
-SASSSPFLRTTSSEVSTVSC 
VLRVHRS*EQPPQR*VLCHV 
27241 - TGAATCACCTTCAAGAAAGGTTACTTCTTTTGGTGCCTTAAGAGGCATGAGTAGTTGCAG - 27300 

- * I TFKKGYFFV3CLKRHE*IiQ 
-ESPSRKVTSFGALRGMSSCS 

NHLQERLLLI>VP*EA*VVAA 
27301 - CTGCTCCTTGCCACGTATACACTGACGGTAAAGTCCCTTGCTTTGAGCGATGAAGACTTC - 27360 
-LLLATYTLTVKSLALSDEDF 
-CSLPRIH*R*SPLL*AMKTS 
APCHVYTDGKVPCFER* RLH 
27361 - ACCTAAGTTGAGTGATCGCAACTTTGCGCCAGCGATAGTGACTTGATCAATGCACATTTC - 27420 

- T * V E * SQLCASDSDLINAHF 
-PKLSDRNFAPAIVT* SMHIS 

LS*VIATLRQR**LDQCTFR 
27421 - GAGTGCCTTGTTAACAACATCAATGAAGCATTTTACACAATCCTTGATGTTATCTGAAGC - 27480 
-ECLVNN IMEAFYT ILDVI * S 
-SALLTTSMKHFTQSLMLSEA 
VPC*QHQ*SILHNP*CYLKQ 
27481 - AACCTGTATTTGACCCTTGACGATGTCAAAAACACCTGTAATGAGAAATTTGAGAATCTC - 27540 
-NLYLTLDDVKN TCNEKFENL 
-TCI *PLTMSKTPVMRNLRIS 
PVFDP*RCQKHL**EI*ESP 
27541 - CCAAGCATCCTTGAGAAATTCAACTCCTGCACTAAGTTTCGCCTCAATCCATTCAAAGAT - 27600 
-PS ILEKFNSCTKFRLNPFKD 
-QASLRNSTPALSFAS I H S K I 
KHP*EIQLLH*VSPQSIQR* 
27601 - AGGCCTGAGTTTTTCAACAGTAGTGCCCAAAAGATTAGACAACCACTGAGAAGTCTGTTG - 27660 
-RPEFFNSSAQKIRQPLRSLL 
-GLSFSTVVPKRLDNH*EVCC 
A*VFQQ*CPKD*TTTEKSVV 
27661 - TACAAGACCACCAGTTACATATGCCATAATAATGACACTGTTGGTGAGCAGGTCTGAAGT - 27720 
-YKTTSYICHNNDTVGEQV* S 
-TRPPVTYAI IMTLLVSRSEV 
QDHQLHMP* **HCW*AGLKY 
27721 - ATAAACCATGGCGTCGACAAGACGTAATGACTGTTCAGAAATACCATCAAGTATGGTGAC - 27780 
-INHGVDKT* *IjFRNTIKYGD 
- * TMASTRRNDCSEIPSSMVT 
KPWRRQDVMTVQKYHQVW*Q 

27781 - AGCTGCTCTTTGCAAATCAGGAATTGAGTGGTTTGCTGCATCAAGTGTGCGCGCAAAAAT - 27840 
-SCSLQIRN*VVCCIKCARKN 

-AALCKSGIEWFAASSVRAKI 
LLFANQELSGLLHQVCAQKL 
27841 - TGATCTGATAACACCAGCAGCCTGTGAGGGAAAACCACACAGTGGTGTTAAAACTGATCT - 27900 

-*SDNTSSL*GKTTQWC*N*S 
-DLITPAACEGKPHSGVKTDL 
I * *HQQPVRENHTVVLKLIS 
27901 - CTGTTGTCCAATGTTCCAAGCACCTTTTACGGGCTTTCCCTTGGTAACTTTATAGTTACC - 27960 
-LLSNVPSTFYGLSLGN F I V T 
-CCPMFQAPFTGFPLVTL*LP 
VVQCSKHLLRAFPW*I>YSYR 
27961 - GCAGGACTCAACAATGGTTTTGAAAGACTTGTAATGAAGACTCTTTATAGTGTCAATAAA - 28020 
-AGLNNGFERLVI KTLYSVNK 
-QDSTMVLKDL* SRLFIVSIK 
RTQQWF*KTCNQDSL* C Q * R 
28021 - GGCACTTGTAGAAGCAGAGAAAGATGCCAAAATGATGGCAACCTCTTCATTCAAATGAAA - 28080 
-GTCRS RERCQN DGNL F IQMK 
-ALVEAEKDAKMMATS S F K * K 
HL*KQRKMPK*WQPLHSNEN 
28081 - ATCGCCAACAATGTTAATGTTAACACGTTCACGACTCAGTATCTCAAGGAGATCCTCATT - 28140 
-IANNVNVNTFTTQYLKEILI 
-SPTMLMLTRSRLSISRRSSF 
RQQC*C*HVHDSVSQGDPHS 
28141 - CAAGGTCTCCACATTGTCACCAGTAATGCCAGTATGGCCTGAGCCAATATCAGCACTAGC - 28200 
-QGLH IVTSNASMA*AN ISTS 
- KVSTLSPVMPVWPEPISALA 
RSPHCHQ*CQYGLSQYQH*H 
28201 - ACGAGGAACCCAGTAGGCACGCTTATTATAGCAGCCAACATAGGCAAACACACAGCCTCC - 28260 
-TRNPVGTLI IAANIGKHTAS 
-RGTQ*ARLL*QPT*ANTQPP 
EEPSRHAYYSSQHRQTHSLQ 
28261 - AAAACATCTAGTCCTACCTCCCTTGCGGAGTCGAGTTTCAATGTTTGAGTGGTTGTGATA - 28320 
-KTSSPTSLAESSFNV*VVVI 
-KHLVLPPLRSRVSMFEWL** 
N I *SYLPCGVEFQCLSGCDN 
28321 - ATCTGCAACACTATGCTCAGGTCCAATCTCTGGGTCTTGACAGGCAGGACATGGCATTTT - 28380 
-I CNTMLRSNLWVLTGRTWHF 
-SATLCSGPISGS *QAGHGIF 
LQHYAQVQSLGLDRQDMAFS 
28381 - CACTACAGCATTAGTAGGTAGGTACCCACATGTAGTAGGTCCTTCAATAACTAAATTTTC - 28440 
-HYSISR*VPTCSRSFNN* IF 
TTALVGRYPHVVGPSITKFS 
L Q H * * V G T H M * * V L Q * LNFQ 
28441 - AGTGCCACAATGTTCACAAGTGGCTTTCAGAAAGTCGCACGTCTGCCATGAAACTTCATC - 28500 
-SATMFTSGFQKVARLP*NFI 
-VPQCSQVAFRKSHVCHETSS 
CHNVHKWLSESRTSAMKLHR 
28501 - GCAATGATTACATTTCATCAAGGTAGACAAGTGCATATTGTTACACTCCTGTGGAGATGC - 28560 
"AMI TFHQGRQVH IVTLLWRC 
-Q*LHFIKVDKCILLHSCGDA 
NDYISSR* TSAYCYTPVEMQ 
28561 - AACAGGGTACACAGAGCGTATACGCCCCATGAAACCCTCAGTCTTTTTCTTTTCAACACG - 28620 
-NRVHRAYTPHETLSLFLFNT 
-TGYTERIRPMKPSVFFFSTR 
QGTQSVYAP*NPQSFSFQHV 
28621 - TGGTTGAATGACTTTGACTTTTGAGTTAAGAGGAAACACAAACTTTGGGCATTCCCCTTT - 28680 
-WLN D F D F * VKRKHKLWAFPF 
-G*MTLTFELRGNTNFGHSPL 
VE*L*LLS*EETQTLGIPL* 
28681 - GAAAGTGTCAAATTTCTTGGCACTCTTAATTTCGAAGGGTGTCTGGTGCTCGTAGCTCTT - 28740 
-ESVKFLGTLN FEGCLV LVAL 
-KVSNFLALLISKGVWCS*LL 
KCQISWHS*FRRVSGARSSY 
28741 - ATCAGAGCGCTCAGTGAACCAGGCAATTTCATGCTCATGGTCACGGCAGCAGTAGACACC - 28800 
-IRALSEPGNFMLMVTAAVDT 
-SERSVNQAISCSWSRQQ*TP 
QSAQ*TRQFHAHGHGSSRHL 
28801 - TCTCTTCGACTCGATGTAATCAAGTTGTTCGGAAAGAGTGCACATTGACTTGCCCGCGCG - 28860 
-SLRLDVIKLFGKSAH* LARA 
- LFDSM* SSCSERVHI DLPAR 
SSTRCNQVVRKECTLTCPRV 
28861 - TGCGAGAAAATCTTTGATGCAATCAAGAGGGTACCCATCTGGGCCACAGAAATTGTTGTC - 28920 
-CEKIFDAIKRVPIWATEIVV 
-ARKSLMQSRGYPSGPQKLLS 
RENL*CNQEGTHLGHRNCCR 
28921 - GACATAGCGAGTGACTGCACCTCCATTGAGCTCACGAGTGAGTTCACGGAGTGCACCACT - 28980 
-DIASDCTSIELTSEFTECTT 
-T*RVTAPPLSSRVSSRSAPL 
HSE* LHLH*AHE*VHGVHHC 
28981 - GCCATGCTTAGTGTTCCAGTTTTGTTCATAATCTTCAATGGGATCAGTGCCAAGCTCGTC - 29040 
-AMLSVPVLFI I FNGISAKLV 
-PCLVFQFCS*SSMGSVPSSS 
HA*CSSFVHNLQWDQCQARH 
29041 - ACCTAAGTCATAAGACTTTAGATCGATGCCATAGCTATGACCACCGGCTCCCTTATTACC - 29100 
-T*VIRL* IDAIAMTTGSLIT 
-PKS*DFRSMP*L*PPAPLLP 
LSHKTLDRCHSYDHRLPYYR 
29101 - GTTCTTACGAAGAAGAACATTGCGGTATGCAATTGGGGTTTCGCCCACATGTGGCACGAG - 29160 
-VLTKKNIAVCNWGFAHMWHE 
-FLRRRTLRYAIGVSPTCGTS 
SYEEEHCGMQLGFRPHVARV 
29161 - TACTCCCAGTGTTATACCGCTACGACCGTACTGAATGCCGTCCATTTCTGCAACCAGCTC - 29220 
-YSQCYTATTVLNAVHFCNQL 
-TPSVIPLRPY*MPSISATSS 
LPVLYRYDRTECRPFLQPAQ 
29221 - AACGACCTTGTGGCCGTGATTGGTGCTTAAGGCATCAGAACGTTTAATGAACACATAGGG - 29280 
-NDLVAVIGA* GIRTFNEHI G 
-TTLWP*LVLKASERLMNT*G 
RPCGRDWCLRHQNV**THRA 
29281 - CTGTTCAAGCTGGGGCAGTACGCCTTTTTCCAGCTCTACTAGACCACAAGTGCCATTTTT - 29340 
-LFKLGQYAFFQLY * TTSAI F 
-CSSWGSTPFSSSTRPQVPFL 
VQAGAVRLFPALLDHKCBF* 
29341 - GAGGTGTTCACGTGCCTCCGATAGGGCCTCTTCCACAGAGTCCCCGAAGCCACGCACTAG - 29400 
-EVFTCLR* GLFHRVPEATH * 
-RCSRASDRASSTESPKPRTS 
GVHVPPIGPLPQSPRSHALA 
29401 - CACGTCTCTAACCTGAAGGACAGGCAAACTGAGTTGGACGTGTGTTTTCTCGTTGACACC - 29460 
-HVSNLKDRQTELDVCFLVDT 
-TSLT*RTGKLSWTCVFSLTP 
RL*PEGQAN*VGRVFSR*HQ 

29461 - AAGAACAAGGCTCTCCATCTTACCTTTCGGTCACACCCGGACGAAACCTAGGTATGCTGA - 29520 
-KNKALHLTFRSHPDET * V C * 

- RTRLSILP FGHTRTKPRYAD 

EQGSPSYLSVTPGRNLGMLM 
29521 - TGATCGACTGCAACACGGACGAAACCGTAAGCAGTCTGCAGAAGAGGGACGAGTTACTCG - 29580 

- * S TATRTKP *AVCRRGTSYS 
-DRLQHGRNRKQSAEEGRVTR 

I DCNTDETVSSLQKRDELLV 
29581 - TTTCTTGTCAACGACAGTAAAATTTATTATTGTTTATACTGCGTAGGTGCACTAGGCATG - 29640 
-FLVN DS KIYYCLYCVGALGM 
-FLSTTVKFIIVYTA*VH*AC 
SCQRQ*NLLLFILRRCTRHA 
29641 - CAGCCGAGCGACAGCTACACAGATTTTAAAGTTCGTTTAGAGAACAGATCTACAAGAGAT - 29700 
-QPSDSYTDFKVRLENRSTRD 
-SRATATQI LKFV*RTDLQEI 
AERQLHRF* SSFREQIYKRS 
29701 - CGAGGTTGGTTGGCTTTTCCTGGGTAGGTAAAAACCTAATAT - 29742 
-RGWLAFPG * VKT * YX 
-EVGWLFLGR*KPNX 
RLVGFSWVGKNLIX 
3. □ Claims Nos.: 

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a) 



Box No. III Observations where unity of invention is lacking (Continuation of item 3 of first sheet) 



This International Searching Authority found multiple inventions in this international application, as follows: 

1. Claims: 1,42-43,89-91,95436-13944^^ 
100,104-105,108-111,115-124,126-127,130-133,135 (partially) 

an isolated hSARS virus having the nucleotide sequence of SEQ ID NO: 15 (that is, an isolated hSARS virus 
having China Center for Type Culture Collection Deposit Accession No.: CCTCC-V200303); a host cell infected with 
the said hSARS virus; an immunogenic or vaccine formulation, a kit or pharmaceutical composition comprising the 
said hSARS virus; the said hSARS virus nucleic acid molecule or a complement thereof; a vaccine formulation 
comprising the virus nucleic acid molecule; an isolated virus polypeptide encoded by the said virus nucleic acid 
molecule; an immunogenic or vaccine formulation, a kit or a pharmaceutical composition comprising the virus 
polypeptide; an isolated antibody to the said hSARS virus or the virus polypeptide, a pharmaceutical composition 
comprising the virus antibody; the method for detecting the presence of the said virus, the virus nucleic acid, the virus 
polypeptide or the virus antibody; the method for detecting the presence of a first nucleic acid molecule derived from 
the said virus 

2. Claims: 2,19-21,36-37,101,112(completely);5-18,26-35,44-74,82-88,92-94,96-100,104-l 
(partially) 

an isolated hSARS virus comprising a nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 1; a host cell infected with the said hSARS virus; an immunogenic or vaccine formulation, a kit or a 
pharmaceutical composition comprising the said hSARS virus; the said SARS virus nucleic acid molecule or a 
complement thereof; an isolated virus polypeptide encoded by the said virus nucleic acid molecule; an immunogenic 
or vaccine formulation or a kit or a pharmaceutical composition comprising the said virus polypeptide; an isolated 
antibody to the said hSARS virus or the virus polypeptide; the method for detecting the presence of the said virus or 
the virus polypeptide or the virus antibody; the method for detecting the presence of a first nucleic acid molecule 
derived from the said virus; an isolated nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO: 1 or a 
complement thereof; an isolated polypeptide encoded by the said nucleic acid molecule; an isolated antibody of the 
said polypeptide; an immunogenic or vaccine formulation, a kit or a pharmaceutical composition comprising the said 
nucleic acid or the polypeptide; the method for detecting the presence of the said polypeptide 

3. Claims: 3,22-23,38-39,102,113(completely);5-18,26-35,44-74,82-88,92-94,96-100,104- 
(partially) 

an isolated hSARS virus comprising a nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 11; a host cell infected with the said hSARS virus; an immunogenic or vaccine formulation, a kit or a 
pharmaceutical composition comprising the said hSARS virus; the said hSARS virus nucleic acid molecule or a 
complement thereof; an isolated virus polypeptide encoded by the said virus nucleic acid molecule; an immunogenic 
or vaccine formulation or a kit or a pharmaceutical composition comprising the said virus polypeptide; an isolated 
antibody to the said hSARS virus or the virus polypeptide; the method for detecting the presence of the said virus or 
the virus polypeptide or the virus antibody; the method for detecting the presence of a first nucleic acid molecule 
derived from the said virus; an isolated nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO: 11 or 
a complement thereof; an isolated polypeptide encoded by the said nucleic acid molecule; an isolated antibody of the 
said polypeptide; an immunogenic or vaccine formulation, a kit or a pharmaceutical composition comprising the said 
nucleic acid or the polypeptide; the method for detecting the presence of the said polypeptide 

4. Claims: 4,24-25,40-41,103,114(completely) 
(partially) 

an isolated hSARS virus comprising a nucleic acid molecule comprising the nucleotide sequence of SEQ ID 
NO: 13; a host cell infected with the said hSARS virus; an immunogenic or vaccine formulation, a kit or a 
pharmaceutical composition comprising the said hSARS virus; the said hSARS virus nucleic acid molecule or a 
complement thereof; an isolated virus polypeptide encoded by the said virus nucleic acid molecule; an immunogenic 
or vaccine formulation or a kit or a pharmaceutical composition comprising the said virus polypeptide; an isolated 
antibody to the said hSARS virus or the virus polypeptide; the method for detecting the presence of the said virus or 
the virus polypeptide or the virus antibody; the method for detecting the presence of a first nucleic acid molecule 
derived from the said virus; an isolated nucleic acid molecule comprising a nucleotide sequence of SEQ ID NO: 13 or 
a complement thereof; an isolated polypeptide encoded by the said nucleic acid molecule; an isolated antibody of the 
said polypeptide; an immunogenic or vaccine formulation, a kit or a pharmaceutical composition comprising the said 
nucleic acid or the polypeptide; the method for detecting the presence of the said polypeptide 